You have to write a Python program (3.5) which take two text files. Start with t


Question

You have to write a Python 3.5 program that takes two text files. Start with the "wordfreq.py" script. As presented, it only calculates the frequency of each individual word in a text file, stored in a dictionary structure. Keep the input prompt that asks how many results you want displayed (n).

Note: Since you’ll be doing many of the same operations twice, you should revise the code to write some sensibly-designed reusable functions instead of simply duplicating each chunk of code and changing variable names.

Revise and enhance the program so that the program ALSO does these things:

1. Ask the user for a second text file to compare with the first one. I’ll refer to the files as “A” and “B” below.

2. Calculate the word frequency for text “B” in the same way it does for “A”.

3. For both files, compute and print out how many total words and how many distinct words each contains.

4. For both files, print out the n most frequent individual words (sorted like the provided example already does), but also showing the percentage of the total words each represents in its file. This is simple to calculate, as: frequency_of_word / total_words * 100. Round that to 2 decimal places.

5. Last, as a simple comparison of the texts, your program should print all the words that occurred more than once in text “A” but not at all in “B”, and vice versa.
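
Steps 4 and 5 can be illustrated on two tiny in-memory texts. This is only a sketch, not a reference solution for the assignment: the helper names `percent_of_total` and `frequent_only_in` and the sample sentences are made up for this example, and words are split on whitespace only.

```python
from collections import Counter


def percent_of_total(count, total_words):
    """Share of the text taken up by one word, as a percentage rounded to 2 decimals."""
    return round(count / total_words * 100, 2)


def frequent_only_in(first, second):
    """Words that occur more than once in `first` and not at all in `second`."""
    return sorted(word for word, count in first.items()
                  if count > 1 and word not in second)


# Made-up sample texts standing in for files "A" and "B".
counts_a = Counter("the cat saw the other cat".split())
counts_b = Counter("the dog saw the dog run".split())

total_a = sum(counts_a.values())  # total words in "A" (6 here)

# "the" occurs 2 times out of 6 words in "A".
print(percent_of_total(counts_a["the"], total_a))
# Words repeated in "A" that never occur in "B", then the reverse.
print(frequent_only_in(counts_a, counts_b))
print(frequent_only_in(counts_b, counts_a))
```

Keeping both rules as small functions that accept any pair of counters means the same code serves "A versus B" and "B versus A" by swapping the arguments.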

Explanation / Answer

import collections
import glob

# 1. Count how often each word occurs in file1.txt. Change the filename to file2.txt to do the same for the second file.
wordcount = collections.Counter()
with open("file1.txt") as file:
    for line in file:
        wordcount.update(line.split())

# Python 3: dictionaries have items() rather than iteritems(), and print is a function.
for k, v in wordcount.items():
    print(k, v)

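# 2 and 5. Print how many total and distinct words each file contains, then write the words the two files share to outfile.txt.
with open("file1.txt") as f1, open("file2.txt") as f2:
    words1 = f1.read().split()
    words2 = f2.read().split()

print("file1:", len(words1), "total words,", len(set(words1)), "distinct words")
print("file2:", len(words2), "total words,", len(set(words2)), "distinct words")

# Count once per file instead of calling list.count() for every word.
counts1 = collections.Counter(words1)
counts2 = collections.Counter(words2)
words = set(words1) & set(words2)  # words that occur in both files
with open("outfile.txt", "w") as output:
    for word in words:
        output.write("{} appears {} times in file1 and {} times in file2.\n".format(word, counts1[word], counts2[word]))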

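# 3. To read the second text file together with the first, use glob. It combines two or more .txt files into one.
# glob.glob("*.txt") matches every .txt file in the current working directory, so run the script from the directory that holds the input files.
generated = {"result.txt", "outfile.txt"}  # files this script writes; skip them so they are not read back in as input
read_files = [f for f in glob.glob("*.txt") if f not in generated]
with open("result.txt", "wb") as outfile:
    for f in read_files:
        with open(f, "rb") as infile:
            outfile.write(infile.read())
            outfile.write(b"\n")  # keep the last word of one file apart from the first word of the next

combined = collections.Counter()  # fresh counter, so the counts from step 1 are not added a second time
with open("result.txt") as file:
    for line in file:
        combined.update(line.split())

# most_common() is a Counter method (not a list method); it yields (word, count) pairs, most frequent first.
for k, v in combined.most_common():
    print(k, v)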
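
Because the same counting and reporting are needed for both files, the steps in the answer could be wrapped in reusable functions. The following is a minimal sketch under stated assumptions: `count_words`, `summarize`, and `main` are hypothetical names, "wordfreq.py" is not shown here so its exact tokenizing and sort order are not reproduced, and words are split on whitespace as in the answer's code. It avoids f-strings so that it runs on Python 3.5.

```python
import collections


def count_words(path):
    """Return a Counter mapping each whitespace-separated word in the file to its frequency."""
    counts = collections.Counter()
    with open(path) as handle:
        for line in handle:
            counts.update(line.split())
    return counts


def summarize(label, counts, n):
    """Print total and distinct word counts, then the n most frequent words with percentages.

    Returns the total number of words so callers can reuse it.
    """
    total = sum(counts.values())
    print("File {}: {} total words, {} distinct words".format(label, total, len(counts)))
    # most_common(n) returns at most n (word, frequency) pairs, most frequent first.
    for word, freq in counts.most_common(n):
        print("  {} {} {}%".format(word, freq, round(freq / total * 100, 2)))
    return total


def main():
    """Prompt for two file names and n, then report on each file with the same functions."""
    path_a = input("First text file (A): ")
    path_b = input("Second text file (B): ")
    n = int(input("How many results to display? "))
    summarize("A", count_words(path_a), n)
    summarize("B", count_words(path_b), n)
```

Calling `main()` runs the interactive version. Since `count_words` returns a plain `Counter`, the two results can also be passed to any comparison step without reading the files again.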
