This needs to be done in Python. I have no idea where to begin, as I have never

ID: 3793045 • Letter: T

Question

This needs to be done in Python. I have no idea where to begin, as I have never used Python before. Can someone give me an idea of where to start or how I should begin the code for this problem? Thanks

Consider the following hash function. Let $U$ be the universe of strings composed of the characters from the alphabet $\Sigma = [A, \ldots, Z]$, and let the function $f(x_i)$ return the index of a letter $x_i \in \Sigma$, e.g., $f(A) = 1$ and $f(Z) = 26$. Finally, for an $m$-character string $x \in \Sigma^m$, define $h(x) = \left(\left[\sum_{i=1}^{m} f(x_i)\right] \bmod \ell\right)$, where $\ell$ is the number of buckets in the hash table. That is, our hash function sums up the index values of the characters of a string $x$ and maps that value onto one of the $\ell$ buckets.

(a) The following list contains US Census derived last names: http:// census.gov/topics/genealogy/ 1990surnames/dist.all.last

Using these names as input strings, first choose a uniformly random 50% of these name strings and then hash them using $h(x)$. Produce a histogram showing the corresponding distribution of hash locations when $\ell = 200$. Label the axes of your figure. Briefly describe what the figure shows about $h(x)$; justify your results in terms of the behavior of $h(x)$. Do not forget to append your code.

Hint: the raw file includes information other than the name strings, which will need to be removed; and, think about how you can count hash locations without building or using a real hash table.
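To make the definition concrete, here is a minimal sketch of the hash function in Python 3. The function name `h` and the parameter name `num_buckets` are illustrative choices, not part of the assignment, and the sketch assumes the input contains only the uppercase letters A to Z.

```python
def h(name, num_buckets):
    """Sum the 1-based alphabet indices of the letters, then reduce mod num_buckets."""
    # Assumption: name contains only uppercase A-Z.
    total = sum(ord(ch) - ord("A") + 1 for ch in name)
    return total % num_buckets


# SMITH -> 19 + 13 + 9 + 20 + 8 = 69, and 69 mod 200 = 69
print(h("SMITH", 200))  # 69

# Anagrams always collide, because addition ignores letter order.
print(h("AMY", 200) == h("MAY", 200))  # True
```

Because the hash depends only on the sum of the letter indices, any two names with the same letter total land in the same bucket.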

Explanation / Answer

# Python 3 version
import random

import matplotlib.pyplot as plt


def get_hash_index(last_name, bucket):
    # Sum the 1-based alphabet positions (A=1, ..., Z=26), then reduce mod bucket.
    total = 0
    for c in last_name:
        total += ord(c) - ord('A') + 1
    return total % bucket


bucket_size = 200

# One counter per bucket; no real hash table is needed.
last_name_hash = {i: 0 for i in range(bucket_size)}

# Keep only the first column (the name) of each line in the census file.
names_list = []
with open("dist.all.last.txt") as fp:
    for line in fp:
        fields = line.split()
        if fields:  # skip blank lines
            names_list.append(fields[0])

# Choose a uniformly random 50% of the names, without replacement.
k = len(names_list) // 2
last_names_list = random.sample(names_list, k)

for name in last_names_list:
    hash_value = get_hash_index(name, bucket_size)
    last_name_hash[hash_value] += 1

print(last_name_hash)

# Counts in bucket order, 0 through bucket_size - 1.
hash_as_list = [last_name_hash[key] for key in sorted(last_name_hash)]

print(hash_as_list)

b = range(bucket_size)
plt.bar(b, hash_as_list)
plt.title("Histogram showing distribution of hash of last names")
plt.xlabel("Bucket number")
plt.ylabel("Size of bucket")
plt.show()

# In case the indentation is messed up, see http://pastebin.com/hKd5jSDf
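The hint about counting hash locations without a real hash table can also be handled with `collections.Counter`. The following is only a sketch under stated assumptions: the helper names `h` and `bucket_counts`, the `fraction` and `seed` parameters, and the sample lines are all hypothetical, and the numeric columns in the sample are placeholders rather than real census values.

```python
import random
from collections import Counter


def h(name, num_buckets):
    # Assumption: name contains only uppercase A-Z.
    return sum(ord(ch) - ord("A") + 1 for ch in name) % num_buckets


def bucket_counts(lines, num_buckets, fraction=0.5, seed=None):
    # Keep only the first whitespace-separated field of each non-blank line.
    names = [line.split()[0] for line in lines if line.strip()]
    # Sample without replacement so each name is chosen at most once.
    rng = random.Random(seed)
    sample = rng.sample(names, int(len(names) * fraction))
    counts = Counter(h(name, num_buckets) for name in sample)
    # A Counter returns 0 for missing keys, so empty buckets are included.
    return [counts[b] for b in range(num_buckets)]


# Hypothetical lines in a "name followed by numeric columns" layout.
sample_lines = [
    "SMITH 0.0 0.0 1",
    "JOHNSON 0.0 0.0 2",
    "",
    "WILLIAMS 0.0 0.0 3",
    "JONES 0.0 0.0 4",
]

counts = bucket_counts(sample_lines, 200, seed=0)
print(sum(counts))  # 2 (half of the four names)
print(sum(1 for c in counts if c == 0), "buckets are empty")
```

When describing the figure, it may help to note that a name of m letters has a letter sum between m and 26m, so a name needs at least 8 letters before its sum can reach 200 and wrap around. That is one reason to expect the counts to cluster in a band of buckets rather than spread evenly, though the plot on the real data is what should be checked.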
