Write a Python program that counts the frequencies of each word in a text. We de
Question
Write a Python program that counts the frequencies of each word in a text. We define a word as a contiguous sequence of non-white-space characters. Different capitalizations of the same character sequence should be considered the same word (e.g., Python and python). The output is formatted as follows: each line begins with a number indicating the frequency of the word, a white space, then the word itself, and a list of line numbers containing this word. You should output from the most frequent word to the least frequent. In case two words have the same frequency, the lexicographically smaller one comes first. All words are in lower case in the output.
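As a small illustration of the rules above (a sketch only; the sample string, the sample counts, and the helper name `tokenize` are made up for this example), lower-casing and splitting on whitespace yields the words, and a sort key of (negative count, word) gives the required order:

```python
def tokenize(text):
    """Lower-case the text and split it on runs of whitespace."""
    return text.lower().split()


sample = "Python and python, b a"
words = tokenize(sample)
# Punctuation stays attached, so "python," is a different word from "python".
print(words)

# Hypothetical counts, used only to show the required ordering:
# higher frequency first, ties broken by the lexicographically smaller word.
counts = {"b": 2, "a": 2, "the": 5, "zebra": 1}
ordered = sorted(counts, key=lambda w: (-counts[w], w))
print(ordered)
```

Negating the count lets a single ascending sort handle both criteria at once.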
Explanation / Answer
inputStr = """Write a Python program that counts the frequencies of each word in a text . We define
a word as a contiguous sequence of non-white-space characters . Different capitalizations
of the same character sequence should be considered same word (e .g . Python and python) .
The output is formatted as follows: each line begins with a number indicating the
frequency of the word, a white space, then the word itself, and a list of line numbers
containing this word . You should output from the most frequent word to the least
frequent . In case two words have the same frequency, the lexicographically smaller one
comes first . All words are in lower case in the output ."""
# Count each word and record the lines on which it appears.
# A word is any run of non-whitespace characters, compared case-insensitively.
wordFreq = {}   # word -> number of occurrences
wordLines = {}  # word -> line numbers (1-based) containing the word
for lineNo, line in enumerate(inputStr.lower().splitlines(), start=1):
    for word in line.split():
        wordFreq[word] = wordFreq.get(word, 0) + 1
        lineNums = wordLines.setdefault(word, [])
        # Record each line number only once per word
        if not lineNums or lineNums[-1] != lineNo:
            lineNums.append(lineNo)
# Sort by descending frequency; break ties in ascending lexicographic order
items = sorted(wordFreq.items(), key=lambda kv: (-kv[1], kv[0]))
for word, count in items:
    print(count, word, " ".join(str(n) for n in wordLines[word]))
--output---
11 the 1 3 4 5 6 7 8
8 . 1 2 3 6 7 8
7 a 1 2 4 5
6 word 1 2 3 5 6
5 of 1 2 3 5
4 in 1 7 8
3 output 4 6 8
3 same 3 7
2 and 3 5
2 as 2 4
2 case 7 8
2 each 1 4
2 frequent 6 7
2 line 4 5
2 python 1 3
2 sequence 2 3
2 should 3 6
2 words 7 8
1 (e 3
1 .g 3
1 all 8
1 are 8
1 be 3
1 begins 4
1 capitalizations 2
1 character 3
1 characters 2
1 comes 8
1 considered 3
1 containing 6
1 contiguous 2
1 counts 1
1 define 1
1 different 2
1 first 8
1 follows: 4
1 formatted 4
1 frequencies 1
1 frequency 5
1 frequency, 7
1 from 6
1 have 7
1 indicating 4
1 is 4
1 itself, 5
1 least 6
1 lexicographically 7
1 list 5
1 lower 8
1 most 6
1 non-white-space 2
1 number 4
1 numbers 5
1 one 7
1 program 1
1 python) 3
1 smaller 7
1 space, 5
1 text 1
1 that 1
1 then 5
1 this 6
1 to 6
1 two 7
1 we 1
1 white 5
1 with 4
1 word, 5
1 write 1
1 you 6
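The same counting can also be packaged as a reusable function. The following is a minimal sketch, not the code above: it assumes the standard-library `collections.Counter` and `defaultdict`, and the function name `word_report` and the demo string are hypothetical.

```python
from collections import Counter, defaultdict


def word_report(text):
    """Return report lines of the form '<count> <word> <line numbers>'."""
    counts = Counter()
    line_numbers = defaultdict(list)
    for line_no, line in enumerate(text.lower().splitlines(), start=1):
        words = line.split()  # words are runs of non-whitespace characters
        counts.update(words)
        # set() so a word repeated on one line is recorded only once
        for word in set(words):
            line_numbers[word].append(line_no)
    # Most frequent first; ties broken by the lexicographically smaller word
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [
        f"{count} {word} {' '.join(map(str, line_numbers[word]))}"
        for word, count in ordered
    ]


if __name__ == "__main__":
    demo = "The cat saw the dog\nthe Dog ran"
    for row in word_report(demo):
        print(row)
```

Returning a list of strings instead of printing directly makes the result easy to check in a test or to write to a file.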