Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

I am having a very hard time understanding what hashing in generaleven relates t

ID: 3615827 • Letter: I

Question

I am having a very hard time understanding what hashing in generaleven relates to. My professor this semester didn't really give muchtime on the the material to explain it well enough to where I couldunderstand it before he have a programming assignment on it. Couldsomeone explain how hashing relates to anything practical inComputer Science.


The homework he gave us concerns hashing.
Here's the initial information given with the assignment:
Create a lookup table for the Java reserved words.

The given hash function is used to access the 73 hash tableaddresses (0..72).
-- Under normal circumstances select a prime number that is atleast twice as big as the list/data size, N.
hash( key ) = of all ASCII character values times 128divided by the hash table size
Example:
hash(“const”) = ((99 + 111 + 110 + 115 + 116) <<7) % PRIME_TABLE_SIZE = 10
hash(“import”) = ((105 + 109 + 112 + 111 + 114 + 116)<< 7) % PRIME_TABLE_SIZE = 39
hash(“a_5”) = ((97 + 95 + 53) << 7) %PRIME_TABLE_SIZE = 43
Define a complete hash table ADT using a C++ class named JavaHash.The JavaHash class manages the 50 Java reserved words using anarray based hash table that uses open addressing. The JavaHashclass automatically performs two operations when instantiated: 1)creates an empty hash table; 2) fills the hash table by reading andinserting the Java reserved words from a text file namedJRWORDS.DAT. After the hash table has been filled up it can then beused to: 1) search for reserved words; 2) view the contents of thehash table.
Notes:
-- Inserting Java reserved words into the hash table:
o Store the actual hashed address – stored address may bedifferent after collision resolution.
o Store the probe count – number of addresses probed/testedwhile performing the insert operation.
-- Searching for Java reserved words in the hash table:
o Update the success count when the reserved word has beensuccessfully located.
-- Collision resolution scheme:
o Linear probing with an address step size of 1

Here's what the instructor explects us to do with thatinformation:

Write a complete C++ program that uses the JavaHash class to countthe number of times each reserved word appears in a text file. Theinput to the program is a text file containing Java reserved words,Java identifiers, and numeric constants. Java identifiers startwith a letter followed by letters, digits, and the underscore. Javawords (reserved word or identifier) are case sensitive. Read oneline at a time from the file. Scan the input character by characterstopping at the end of each word skipping over the numericconstants. Each entry in the file is separated by at least oneblank space or the end of the current line. Each line contains zeroor more entries. Search for each word in the hash table. If areserved word is found, then update the success counter for thatreserved word. After the file has been fully processed the programoutputs/views the contents of the hash table.

With this all said. I am stuck with not really knowing how to writethe implentation for the class. I have the header file made alreadyand a general driver also.

Explanation / Answer

A hash function is any well-defined procedureor mathematical function that converts a large, possiblyvariable-sized amount of data into a small datum, usually a singleinteger that may serve as an index to an array. The values returnedby a hash function are called hash values,hash codes, hash sums, or simplyhashes.

Hash functions are mostly used to speed up table lookup or datacomparison tasks — such as finding items in a database,detecting duplicated or similar records in a large file, findingsimilar stretches in DNA sequences, and so on.

A hash function may map two or more keys to the same hash value.In many applications, it is desirable to minimize the occurrence ofsuch collisions, which means that the hash function must map thekeys to the hash values as evenly as possible. Depending on theapplication, other properties may be required as well. Although theidea was conceived in the 1950s, the design of good hash functionsis still a topic of active research.

Hash functions are related to (and often confused with)checksums, check digits, fingerprints, randomization functions,error correcting codes, and cryptographic hash functions. Althoughthese concepts overlap to some extent, each has its own uses andrequirements and is designed and optimised differently. TheHashKeeper database maintained by the National Drug IntelligenceCenter, for instance, is more aptly described as a catalog of filefingerprints than of hash values.

Hash tables

Hash functions are primarily used in hash tables, to quicklylocate a data record (for example, a dictionary definition) givenits search key (the headword). Specifically, the hash function isused to map the search key to the hash. The index gives the placewhere the corresponding record should be stored. Hash tables, inturn, are used to implement associative arrays and dynamicsets.

In general, a hashing function may map several different keys tothe same index. Therefore, each slot of a hash table is associatedwith (implicitly or explicitly) a set of records, rather than asingle record. For this reason, each slot of a hash table is oftencalled a bucket, and hash values are also calledbucket indices.

Thus, the hash function only hints at the record's location— it tells where one should start looking for it. Still, in ahalf-full table, a good hash function will typically narrow thesearch down to only one or two entries.