performance optimization for hackerlerin_cehennemi

When hash table capacity is power of 2 in this case growth factor should be 2 or 4, main problem related to hash table is to growth as fast as possible. for e.g if last inserted data is 1 bit above of 2GB table should be increased 4GB.
Redis uses this strategy for hashtable implementation:
setting/checking (to make sure capacity is power of two) of capacity:
	unsigned long realsize = _dictNextPower(size);
    /* Rehashing to the same table size is not useful. */
    if (realsize == d->ht[0].size) return DICT_ERR;

    /* Allocate the new hash table and initialize all pointers to NULL */
    n.size = realsize;
    n.sizemask = realsize-1;

Generate index to get HT entry:
    /* Get the index of the new element, or -1 if
     * the element already exists. */
    if ((index = _dictKeyIndex(d, key, dictHashKey(d,key), existing)) == -1)
        return NULL;

static long _dictKeyIndex(dict *d, const void *key, uint64_t hash, dictEntry **existing)
...
dx = hash & d->ht[table].sizemask; (or it could be hash % d->ht[table].sizemask, gcc would generate AND instruction)
...
_dictKeyIndex does & operation with sizemask because it has confidence about capacity that is power of two. But it scales too fast capacity of HT. In case of using prime table size then compiler generates more instructions for modulus operation:
        ...
		movsx   rdx, eax
        imul    rdx, rdx, 1431655766
        shr     rdx, 32
        mov     esi, eax
        sar     esi, 31
        mov     ecx, edx
        sub     ecx, esi
        mov     edx, ecx
        add     edx, edx
        add     edx, ecx
        sub     eax, edx
		...