Question

Recall that in separate chaining, we create a linked list (or array) at each position of the hash table. When multiple items collide at that position, we store them in that list/array. Since to find an item, we now have to perform a linear search through that "bucket", it's important to bound how many items are likely to be stored in any one bucket.

For the entire first problem, we assume that the load factor is 1, i.e., we are placing m items in m hash buckets. (Obviously, the guarantees will be better if the load factor is lower, but this way, we will avoid having an extra parameter you have to deal with.) Prove the following:

The probability that any hash bucket contains more than ln(m) items goes to 0 as m goes to infinity. (As a result, for large hash tables, all hash operations are very likely to run in time O(log m).)

[With a somewhat more complicated analysis, one can prove an upper bound of O(log m/log log m), and with more work also a lower bound of Ω(log m/log log m). But you don't have to prove that.]

Explanation / Answer

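First, a hash table with separate chaining supports its operations in expected O(1) time when the load factor is constant. That is only an average, though. The actual cost of a single operation depends on two things:

- Computing the hash function of the key

- Handling collisions, i.e., searching the chain stored in the key's bucket.

The first cost is fixed, so the worst case is governed by the length of the longest chain. That is exactly the quantity the statement asks us to bound.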
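To make the two costs concrete, here is a minimal sketch of a separate-chaining table in Python. The class name `ChainedHashTable` and its methods are hypothetical names chosen for illustration, and Python's built-in `hash` stands in for whatever hash function the table would really use.

```python
class ChainedHashTable:
    """Minimal separate-chaining table: one Python list (the "bucket") per slot."""

    def __init__(self, num_buckets):
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key):
        # Cost 1: compute the hash of the key and map it to a bucket.
        # Built-in hash() is an assumption standing in for the real hash function.
        return hash(key) % len(self.buckets)

    def insert(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))       # collision or empty bucket: extend the chain

    def find(self, key):
        # Cost 2: linear search through the chain in the key's bucket.
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return None

    def longest_chain(self):
        # The worst-case number of comparisons for a single find().
        return max(len(bucket) for bucket in self.buckets)
```

A call to `find` never inspects more entries than `longest_chain()` returns, which is why a bound on the fullest bucket translates into a bound on the running time of every operation.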

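Now for the statement to prove. With load factor 1 we place m items into m buckets, and we assume that each item lands in a bucket chosen uniformly at random, independently of the other items (simple uniform hashing). Letting m go to infinity does not prevent chaining: collisions remain very likely, so we have to bound the chain lengths directly.

Fix one bucket and an integer k >= 1. If the bucket holds at least k items, then some set of k items all hashed to it. There are C(m, k) such sets, and each one lands entirely in the bucket with probability (1/m)^k, so

P(bucket holds at least k items) <= C(m, k) * (1/m)^k <= 1/k! <= (e/k)^k,

using C(m, k) <= m^k / k! and k! >= (k/e)^k.

Now let k be the smallest integer greater than ln(m), so that "more than ln(m) items" means "at least k items". For k > 1 the quantity (e/k)^k decreases as k grows, so for m >= 3 we get (e/k)^k <= (e/ln(m))^ln(m) = m^(1 - ln ln(m)). By the union bound over all m buckets,

P(some bucket holds more than ln(m) items) <= m * m^(1 - ln ln(m)) = m^(2 - ln ln(m)).

Since ln ln(m) goes to infinity, the exponent 2 - ln ln(m) goes to minus infinity, and so this probability goes to 0 as m goes to infinity.

A small case shows why "no collisions" is the wrong picture. With 10 items and 10 buckets, the probability that every item gets its own bucket is 10!/10^10, about 0.00036, so collisions are almost certain. What the bound guarantees is that, for large m, no single chain is likely to grow beyond ln(m) items.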
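The claim can also be checked empirically. The sketch below is a rough simulation, not a proof: it assumes each item is placed in a uniformly random bucket (standing in for an ideal hash function), throws m items into m buckets, and reports how often the fullest bucket holds more than ln(m) items. The function names `max_bucket_load` and `exceed_fraction` are made up for this example.

```python
import math
import random
from collections import Counter


def max_bucket_load(m, rng):
    """Throw m items into m buckets uniformly at random; return the largest bucket size."""
    counts = Counter(rng.randrange(m) for _ in range(m))
    return max(counts.values())


def exceed_fraction(m, trials, seed=0):
    """Fraction of trials in which some bucket holds more than ln(m) items."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    threshold = math.log(m)
    exceed = sum(max_bucket_load(m, rng) > threshold for _ in range(trials))
    return exceed / trials


if __name__ == "__main__":
    for m in (10, 100, 1_000, 10_000, 100_000):
        frac = exceed_fraction(m, trials=20)
        print(f"m={m:>7}  ln(m)={math.log(m):5.2f}  fraction exceeding={frac:.2f}")
```

For small m the fullest bucket exceeds ln(m) fairly often, which is consistent with the statement being asymptotic. As m grows, the reported fraction should shrink toward 0.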