Advertisement
emin_int11

performance optimization for hackerlerin_cehennemi

Feb 27th, 2021
145
0
Never
Not a member of Pastebin yet? Sign Up, it unlocks many cool features!
text 1.62 KB | None | 0 0
  1. When hash table capacity is power of 2 in this case growth factor should be 2 or 4, main problem related to hash table is to growth as fast as possible. for e.g if last inserted data is 1 bit above of 2GB table should be increased 4GB.
  2. Redis uses this strategy for hashtable implementation:
  3. setting/checking (to make sure capacity is power of two) of capacity:
  4. unsigned long realsize = _dictNextPower(size);
  5. /* Rehashing to the same table size is not useful. */
  6. if (realsize == d->ht[0].size) return DICT_ERR;
  7.  
  8. /* Allocate the new hash table and initialize all pointers to NULL */
  9. n.size = realsize;
  10. n.sizemask = realsize-1;
  11.  
  12. Generate index to get HT entry:
  13. /* Get the index of the new element, or -1 if
  14. * the element already exists. */
  15. if ((index = _dictKeyIndex(d, key, dictHashKey(d,key), existing)) == -1)
  16. return NULL;
  17.  
  18. static long _dictKeyIndex(dict *d, const void *key, uint64_t hash, dictEntry **existing)
  19. ...
  20. dx = hash & d->ht[table].sizemask; (or it could be hash % d->ht[table].sizemask, gcc would generate AND instruction)
  21. ...
  22. _dictKeyIndex does & operation with sizemask because it has confidence about capacity that is power of two. But it scales too fast capacity of HT. In case of using prime table size then compiler generates more instructions for modulus operation:
  23. ...
  24. movsx rdx, eax
  25. imul rdx, rdx, 1431655766
  26. shr rdx, 32
  27. mov esi, eax
  28. sar esi, 31
  29. mov ecx, edx
  30. sub ecx, esi
  31. mov edx, ecx
  32. add edx, edx
  33. add edx, ecx
  34. sub eax, edx
  35. ...
Advertisement
Add Comment
Please, Sign In to add comment
Advertisement