10.2 Write a program to read through the mbox-short.txt and figure out the distr
ID: 3661622 • Letter: 1
Question
10.2 Write a program to read through the mbox-short.txt and figure out the distribution by hour of the day for each of the messages. You can pull the hour out from the 'From ' line by finding the time and then splitting the string a second time using a colon.
Once you have accumulated the counts for each hour, print out the counts, sorted by hour as shown below.
I have this much done
name = raw_input("Enter file:")
if len(name) < 1 : name = "mbox-short.txt"
# open file in read mode
handle = open(name, "r")
# read all contents of file
text = handle.readlines()
# close the file (the parentheses are required to actually call close)
handle.close()
# loop over each line of text
for line in text:
    if not line.startswith('From'): continue
    line = line.rstrip()
    # split the line into words
    words = line.split()
    # if there are more than 5 words, then index 5 is valid
    # (this also skips the short 'From:' header lines)
    if len(words) > 5:
        # words[5] is the time field, HH:MM:SS
        pieces = words[5]
        # after splitting on ':', email[0] holds the hour
        email = pieces.split(':')
    else: continue
Explanation / Answer
def openFile():
    fname = raw_input("Enter file name: ")
    if len(fname) < 1 : fname = "mbox-short.txt"
    try:
        fh = open(fname, 'r')
    except IOError:
        print "Error opening file", fname
        quit()
    return fh

def startsWith():
    sw = raw_input("Enter line prefix to consider: ")
    if len(sw) < 1 : sw = "From"
    return sw

def countTimes(lines, s):
    counts = dict()
    for line in lines:
        # skip header lines such as 'From:'; only the 'From ' lines carry the time
        if line.startswith(s) and not line.startswith(s + ':'):
            words = line.strip().split()
            # guard against lines too short to have a time field
            if len(words) < 6: continue
            time = words[5]
            # everything before the first ':' is the hour
            hour = time[:time.find(":")]
            counts[hour] = counts.get(hour, 0) + 1
    return counts

def sortTimes(d):
    lst = list()
    for key, val in d.items():
        lst.append((key, val))
    # sort the (hour, count) pairs by hour
    lst.sort()
    for key, val in lst:
        print key, val

fh = openFile()
sw = startsWith()
dictionary = countTimes(fh, sw)
fh.close()
sortTimes(dictionary)
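The two-step split described in the question (split the 'From ' line into words, then split the time field on a colon) can be tried without the data file. The following is only a minimal sketch: the helper names hour_of and count_hours and the sample lines are made up for illustration and are not taken from mbox-short.txt. The print call is written so that it behaves the same under Python 2 and Python 3.

```python
def hour_of(line):
    """Return the hour string from a 'From ' line, or None for any other line."""
    # 'From ' with a trailing space excludes 'From:' header lines.
    if not line.startswith('From '):
        return None
    words = line.split()
    # The time field is expected at index 5; skip lines that are too short.
    if len(words) < 6:
        return None
    # First split gave the 'HH:MM:SS' field; the second split isolates 'HH'.
    return words[5].split(':')[0]


def count_hours(lines):
    """Count how many 'From ' lines fall in each hour."""
    counts = {}
    for line in lines:
        hour = hour_of(line)
        if hour is not None:
            counts[hour] = counts.get(hour, 0) + 1
    return counts


# Hypothetical sample lines in the same layout as the mbox 'From ' lines.
sample = [
    "From alice@example.com Sat Jan  5 09:14:16 2008",
    "From: alice@example.com",
    "From bob@example.com Fri Jan  4 18:10:48 2008",
    "From carol@example.com Fri Jan  4 09:02:11 2008",
]

counts = count_hours(sample)
# Hours are zero-padded strings, so sorting them as text gives hour order.
for hour, count in sorted(counts.items()):
    print("%s %d" % (hour, count))
```

For these sample lines the loop prints "09 2" and then "18 1". To run it against the real file, pass the open file handle to count_hours in place of the sample list.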