Home > Uncategorized > Hash Overflow due to 64 bit upcasting

## Hash Overflow due to 64 bit upcasting

Lately, I had to debug the following piece of code, where it caused overflow on the hash bucket design.  The code worked perfectly on a Windows machine while compiled for Win32, but failed to work on a Linux Mint x64 machine.  The code is listed below, which basically calculates hash value of an input 32 bit unsigned number, limiting the hash value to 2^10 (1Meg).

hash = ( fpArray*2654404609 )>>12; // Calculate the hash and limit the value to 2^20 (1 Meg)

When the input value for fpArray was 1724463449 (0x66C93959), the hash value generated was 1779068547 (0x6A0A6E83), which is more than (0x000FFFFF) to cause the hash bucket overflow.

unsigned hash = fpArray * 2654404609;
hash = hash >> 12;

When I rewrote the code like the above, the value of hash was 2800236889 (0xA6E83959).  Upon shifting right by 12 yields 638651 (0x0009BEBB), which is the correct and expected hash value.

Overall, the first snippet of code appears to be correct.  Do you see a problem there?  I couldn’t find the issue, until I recalled the 32bit vs 64bit difference.  If you carefully look at the multiplier 2654404609 (0x9E370001), although appears to be a valid 32 bit number, what is the default assignment of type to this number by the compiler?  If it was assigned 64bits, what would happen to the results?  To validate this, I changed the 2nd snippet as the following.

unsigned long hash = (unsigned long)fpArray * 2654404609;
hash = hash >> 12;
unsigned h2 = (unsigned)hash;

Now, when the input is the same 1724463449 (0x66C93959), the value of hash becomes 4577423727077636441 (0x3F8646A0A6E83959) and upon right shifting by 12 bits yields 1117535089618563 (0x0003F8646A0A6E83). Followed by downcasting to unsigned yield 1779068547 (0x6A0A6E83). Bingo!

So, what is happening here? While performing (fpArray * 2654404609), the computation is upcasted to 64bit computation by the 64 bit compiler.  So, what is the solution? Just put a “U” at the end of the constant.

hash = ( fpArray*2654404609U )>>12; // Calculate the hash and limit the value to 2^20 (1 Meg)
(or)
const unsigned multipler = 2654404609; // here U suffix is not needed as the constant is explicitly made unsigned
hash = ( fpArray * multiplier ) >> 12;

Now, the computation will happen with 32 bit numbers to get the expected outputs.

Lessons Learned here:

1. While using constants, beware of the upcasting and downcasting. So use proper suffixes like U, L, F etc.
2. Instead of using constants directly in expressions, use them as constant variables.
3. Be conscious about the compiler type and the assumptions made by the compiler in different build modes.
Tags: