i need help with creating a hash table for sorting objects called cards There wi
ID: 672993 • Letter: I
Question
i need help with creating a hash table for sorting objects called cards
There will be two classes:
HashTable, which includes the hash array, public functions find() and add(), and maybe some utility functions.
Card (the data objects), containing a name (String) and number (int), display(), probably a constructor.
For this assignment, the "hash key" will be the number field of a Card object. This is the outside-world value which figures in the hashing calculations. It is the value used in find and add.
-- that is, find(176) will look for a Card object (in the hash table) with number==176. Of course, find() will return the whole Card; add() will accept a whole Card argument, from which it will extract the number field's value ....
Note: Hash Tables are discussed in Lafore, chapter 11.
Note: there is a HashTable (and HashMap) class in the Java standard library. We'll work with them later; Do Not use the library version now. Our aim is to learn how this works "under the hood".
Hash Table:
The Hash Table implementation uses an internal array to hold the items being stored. The size of the array should be about twice as big as the number of values we hope to store. Let's say, something like 10 would be good; we're just doing a demo project. (prime numbers are supposed to work better, but it's easier to calculate mod-10 in your head).
Note: To create the Hash Table array, you'll need to declare it:
Card[] hashArray;
and then create the actual object, as we do with any array
hashArray = new Card[10]; ((or whatever size you want))
This doesn't create the Cards. You still need a
jones = new Card(...);
for each Card. And the Card objects have to be plugged into the array (though this is done by add() ).
If this stuff with Card objects is confusing, then I suggest you start with using just simple int values in the Hash Table, like we did in class the first day (and in the animation). Then you have
int[] hashArray = new int[10];
Once this is working, you can then change over to Card objects.
Hash Index:
To use a [small] array, we need a trick.
For normal array indexing, the value itself is used as subscript: given j, I look in A[j].
Hashing preserves that rapid access, but a trick is used to reduce the large range of values of the keys into the proper range. This is done using the "%" (mod or remainder) operator. The value (a % b) is defined as the remainder when a is divided by b -- and it's an integer. As you can easily see, the remainder cannot be any larger than b, so if b is the size of the table, the result of the % operator is a valid subscript value, in range.
HowTo:
So here's the cookbook approach:
take the hash key (that external value, the number field from the Card being added or number presented to find();
calculate (key % table.length) to get the hash index.
Use the hash index to reference the hash table array.
Look at the array element's contents.
If it's empty (==null) and we're doing an add() operation, we're in luck.
If it's empty and we've been asked to find(), then we report failure (it's not there) ..
If it's not empty and we were asked to find(), we might be in luck; but not so fast. Check the actual hash key of the Card object in the array, see if equal to the hash key we're looking for.
If it matches, good. probably return the Card object.
If not, then we have a collision. Keep looking. (see below) You may or may not find it.
If it's not empty and it was an add() operation, then you have a collision and you'll have to look further for a place to put it.
If a collision occurs, then use the "linear probe" scheme: try again at the next slot of the hash table array, and then the one after that, etc. (don't forget to wrap around when you reach the end ... don't fall off the edge of the world!-). Eventually you should either find what you're looking for (find operation) or find an empty slot (add operation-- ok; find operation-- fail).
Test the program by first creating a bunch of Card objects and adding them into the hash table ... pick values that don't cause collisions at first. Then try finding values -- some of the ones you've entered, some which have not.
Maybe now do another enter, causing a collision.
Then go to a simple interactive scheme (using Scanner and nextInt()) and do some more testing.
Your program should print a bunch of debugging info, so you can tell it's working properly. (required)
Inserting a duplicate value is usually considered a problem. It would be good to test for this and report any problem.
Explanation / Answer
Collision is the condition where two records are stored in the same location. But two records cannot be stored in same memory addresses; the process of finding an alternate location is called collision resolution.
It is a condition in which a hash function gives the same hash code
or hash table location for two different keys. Consider a case where two passwords encrypt to same value -thus there are two passwords that can be used to enter the system. Suppose there are numbers of forms that needs to be placed in sorted order by first letter of their surname.
Collision Resolution Techniques
·Cuckoo Hashing.
In cuckoo hashing two memory accesses are used in lookups. After the first memory lookup becomes unsuccessful hashing is done by the second hash function. Comparatively it is much faster than
chained hashingfor small, cache resident hash tables. Also the bucketized versions of cuckoo hashing (variants that use buckets that contain more than one key) is much more faster than conventional methods and it is used for large hash tables, when space utilization is high .When storing a large set of distinct keys, comparative to the other traditional hashing schemes tables, including linear probing,the performance of the cuckoo hash table , for maintaining a large dictionary of integers in memory is recorded to be faster to build, search, and delete than the equivalent chained and array hash tables. Although the cuckoo hash table is space intensive relative to the array hash table. But it was not as scalable, since the number of slots available, their capacity, and the extra vacant slots needed to prevent an irresolvable collision bound the total number of distinct keys storable. Hence, in order to cater for an unexpected increase in the number of distinct keys processed, for example, the cuckoo hash table will need tobe resized which can be both expensive
and space intensive, particularly in a dynamic environment.
The bucketized cuckoo hash table was confirmed to be the slowest hash table, far inferior to both the chained and array hash tables if we consider the performance of hash tables under important skew access. Indeed, the greatest hash table for both
distinct and skew data distribution was linear probing, though it too is not a scalable option relation to the array hash table; and even though more space efficient than cuckoo hashing, it also requires a remaining of unfilled slots. 23
Two Way Chaining
The next method, Two Way Chaining, can be described as two instances of chained hashing. A key is inserted in one of the two hash tables, namely the one where it hashes to the shorter chain. Two way chaining is a novel hashing scheme that uses two independent truly uniform hash functions f and g to insert m keys into a hash table with n chains, where each key x is inserted into the shortest chain among the chains f(x) and g( x), breaking ties randomly. The twoway chaining paradigm is utilized to design efficient open addressing hashing schemes. If a longer list is needed, a rehash
must be performed
Linear probing method is used for resolving hash collisions of values of hash functions by sequentially searching the hash table for a free location. This methods performance is more sensitive to the input distribution as compare to other methods like double hashing which will be discussed later.
A problem with the linear probe method is primary clustering. In primary clustering blocks of data maypossibly be able to form collision.Several attempts may be required by any key that hashes into the cluster to resolve the collision
Open addressing hash tables be used to stock up the records straight inside the array. This approach is also known as closed hashing. This procedure is based on probing. A hash collision is resolute by probing, or searching through interchange locations in the array (the probe sequence) awaiting either the target record is establish, or an vacant array slot is establish, thats the sign of that there is no such key in the table
Linear probing
in which the interval between probes is fixedoften at 1.
Quadratic Probing
in which the interval between probes increases proportional to
the hash value (the interval thus increasing linearly and the indices are described by a quadratic function).
Double Hashing
in which the interval between probes is computed by another
hash function.
Advantages over Open Addressed hash tables
The elimination function is straightforward and resizing the table can be delayed for a greatly longer time because performance degrade more gracefully even when every slot is used this is a chief benefit of chained hash tables above open addressed hash
tables in that. In addition numerous chaining hash tables might not need resizing at all because performance degradation is linear as the table fills.But besides that chained hash tables inherit the disadvantages of linked lists. When storing small records, the overhead of the linked list can be important. Traversing a
linked list has poor cache performance is one more extra disadvantage of it.
Chaining: In chaining we use array indexes to store the values. If hash code of second value also points to the same index then we replace that index value with an linked list and all values pointing to that index are stored in the linked list and actual array index points to the head of the the linked list. But if there is only one hash code pointing to an index of array then the value is directly stored in that index. Same logic is applied while retrieving the values. This is used in Java HashMap/Hashtable to avoid collisions.
Linear probing: This technique is used when we have more index in the table then the values to be stored. Linear probing technique work on the concept of keep incrementing until you find the empty slot. The pseudo code looks like this..
index = h(k)
while( val(index) is occupied)
index = (index+1) mod n
Double hashing technique: In this technique we use two hashing functions h1(k) and h2(k). If the slot at h1(k) is occupied then the second hashing function h2(k) used to increment the index. The pseudo-code looks like this..
index = h1(k)
while( val(index) is occupied)
index = (index + h2(k)) mod n
Linear probing and double hashing techniques are part of open addressing technique and it can only be used if available slots are more then the number of items to be added. It takes less memory then chaining because there is no extra structure used here but its slow because of lot of movement happen till we find an empty slot. Also in open addressing technique when an item is removed from a slot we put an tombstone to indicate that the item is removed from here that is why its empty.
Related Questions
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.