Sololearn: Learn to Code

+ 6

From https://www.cplusplus.com/reference/unordered_map/unordered_map/bucket/: "A bucket is a slot in the container's internal hash table to which elements are assigned based on the hash value of their key. Buckets are numbered from 0 to (bucket_count-1)." unordered_map::bucket_count() returns the no. of buckets, or in other words, the no. of slots of available in the hash table. By default, the bucket_count is 1, which means there is 1 slot available. When the size of the unordered_map becomes >1, the number of slots are increased to 13 (this number most probably depends on the hash function). When the size becomes >13, the slots are increased to 29. Try printing uom.size() in each iteration of the for loop. Disclaimer: I had never heard of the term 'bucket' before. All this I said is what I concluded from playing with bucket_count a bit just now. So even though it seems very plausible to me, it can be wrong.

22nd May 2021, 3:59 PM

XXX

+ 6

Hash table implementations tend to pick between two desirable qualities: 1 . maintaining a power-of-two bucket_count() (i.e. rounding any value passed to reserve up to the next power of two if necessary), so - a size_t value returned by the hash function can be mapped onto the range of buckets using a 1-CPU-cycle bitwise AND operation (e.g. 8 buckets -> hash value ANDed with 7); - this has the undesirable effect of chopping off the high-order bits from the hash value, so they don't help randomise the bucket placement - Visual C++ does this 2. maintaining a prime bucket_count() - this has the extremely desirable side-effect of having high-order bits in the hash value affect the bucket selection, so lower-quality (i.e. faster) hash functions still often manage a more equal, less-collision/clustering-prone, bucket placement - implemented naively, this forces the compiler to do a mod ("%") operation by a runtime-variable bucket_count(), which may take e.g. 40-90 CPU cycles, depending on the CPU. A faster alternative is use the index into the table of prime numbers used when sizing the hash table to switch into a mod operation by that hard-coded constant prime value, so the compiler can try to optimise the mod using bit shifts, subtractions or additions, or multiplications if that's necessary (you can see the kinds of optimisations possible in this snippet on godbolt( https://godbolt.org/z/K45cYE )) - GCC does this; I think clang does too. source : https://stackoverflow.com/a/65973668/12030775

23rd May 2021, 7:35 AM

Arsenic

+ 2

ChillPill I don't know about that. As I said before, I think it might be related hashing function used in the implementation. The reason I believe that is because on Clang, the output of your code is this """ 0 2 2 5 5 5 11 11 11 11 11 11 23 23 23 23 """ I have no idea in these things so I can't say anything for sure.

22nd May 2021, 5:36 PM

XXX

+ 2

ChillPill Hmm... interesting. Thanks for sharing. But the question still remains that why this rule? Because isn't finding the next prime number more expensive than simply multiplying by 2 (which can also be done faster using bitwise operation)?

23rd May 2021, 2:33 AM

XXX

- 2

I need a program that can generate 1000 random numbers and can sort them in decending order using bucket sort

24th May 2021, 6:54 AM

Akor cephas Ternenge