Double Hashing

Data Structure and Algorithm Analysis in C++

5.4.3 Double Hashing

In the beginning of chapter 5.4, the formula h_i(x) = (hash(x) + f(i)) is mentioned and f(i) is defined as f(i) = i ;

Here f(i) is defined as f(i) = i * hash₂(x) and hash₂(x) = R - (x mod R) where R is a prime smaller than TableSize. If R = 7 then hash₂(x) = 7 - (x mod 7).

The case it uses is as followed. Figure 5.18 shows the result of inserting keys {89, 18, 49, 58, 69} into a hash table.

Hash(x), or hash₁(x) to be precise, is defined as hash₁(x) = x % 10 // TableSize = 10.

So if there's no collision, h(x) = hash₁(x); otherwise h_i(x) = { hash₁(x) + i * hash₂(x) } % TableSize .

The first collision occurs when 49 is inserted. hash₂(49) = 7 − 0 = 7, so 49 is inserted in position 6 // { 49%10 + (7 - 1* 49%7) } % 10 .

hash₂(58) = 7 − 2 = 5, so 58 is inserted at location 3 // ( 8 + 1*5 )%10 .

Finally, 69 collides and is inserted at a distance hash₂(69) = 7− 6 = 1 away.

If we tried to insert 60 in position 0, we would have a collision. Since hash2(60) = 7 − 4 = 3, we would then try positions 3, 6, 9, and then 2 until an empty spot is found.

It is essential to guarantee that the TableSize is Prime when double hashing is used.

For instance, if we want to insert 23 into the table, it first collides with 58, then hash₂(23) = 7- 2 = 5，h₁(23) = 8, where 18 has already taken; h₂(23) = 3, h₃(23) = 8....

An infinite loop...resulting from the tablesize 10.