PYTHON HELP!! :( Create text analyzer that processes text and provides informati
Question
PYTHON HELP!! :(
Create a text analyzer that processes text and
provides information about its word contents.
One thing it should be able to support is the ability to
create a report that shows a count of how many times each word
occurs in the text. The report should be sorted, with a
primary sort of word length, and a secondary ASCII sort.
The code should be production quality and written as if
it were part of a utility library.
Extra credit given for solutions that do not rely on Java Collections.
Example:
Input:
The quick brown fox jumped over the lazy brown dog’s back
Output:
1 The
1 fox
1 the
1 back
1 lazy
1 over
2 brown
1 dog’s
1 quick
1 jumped
Currently I have:
from collections import Counter
file = open("text.txt","r")
#read the file (The quick brown fox jumped over the lazy brown dog’s back) & split words
wordcount =Counter(file.read().split())
#printing word count
for item in wordcount.items():
    print("{} {}".format(*item))
#I don't know how to sort by word size, nor how to get rid of the weird symbols next to "dog's" . Please help me :(
Explanation / Answer
1. To sort by word size, you can do the following:
a) First extract the length of each word and store it in some object. To get the length of a word, you can use the len() function.
Ex.: len("Hello World")
This will return 11 (the space counts as a character).
b) Then create a list of word tuples (each word paired with its length). For ex.
word_tuples=[
('The', 3),
('quick', 5),
.....
....
]
After this you can use the sorted() function to sort the list based on a key; it returns a new, sorted list. For ex.:
sorted(word_tuples, key = lambda wordlength : wordlength[1])
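The question also asks for a secondary ASCII sort, which a length-only key does not give you: words of equal length simply keep their original order. One way to get both is a tuple key. This is a minimal sketch, assuming the words are counted with Counter as in the question; the function name sorted_word_counts is made up for illustration.

```python
from collections import Counter

def sorted_word_counts(text):
    """Return (word, count) pairs sorted by word length, then by ASCII order."""
    counts = Counter(text.split())
    # A tuple key sorts on length first; ties fall back to plain string
    # comparison, which orders by code point (uppercase before lowercase).
    return sorted(counts.items(), key=lambda pair: (len(pair[0]), pair[0]))

sample = "The quick brown fox jumped over the lazy brown dog's back"
for word, count in sorted_word_counts(sample):
    print(count, word)
```

Because the length is computed inside the key, a separate list of (word, length) tuples is not strictly needed, though building one as described above works just as well.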
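2. Getting rid of unwanted symbols: To remove leading and trailing whitespace, the strip() function can be used. To replace a symbol, the replace() function can be used. For ex.:
text = "  Hello World  "
text.strip()
text = "@How are you?@"
text.replace("@", "")
The first call strips the leading and trailing whitespace, while the second replaces every "@" symbol with "". Both return a new string and leave the original unchanged, so assign the result to a variable if you want to keep it. (The examples use the name text rather than str so that the built-in str type is not shadowed.)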
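A short sketch of both calls, plus one possible cause of the odd symbols next to "dog’s". The encoding explanation is an assumption, since the question does not show the actual output: if text.txt is saved as UTF-8 but read with a different default encoding (cp1252 on many Windows setups), the curly apostrophe turns into several stray characters. The helper read_text is hypothetical.

```python
# strip() and replace() return new strings; the original is unchanged,
# so keep the result in a variable.
greeting = "  Hello World  "
trimmed = greeting.strip()             # drops leading and trailing whitespace
question = "@How are you?@"
cleaned = question.replace("@", "")    # drops every "@"

# Assumption: the odd symbols come from UTF-8 bytes decoded as cp1252.
raw = "dog\u2019s".encode("utf-8")
garbled = raw.decode("cp1252")         # the apostrophe becomes three characters
correct = raw.decode("utf-8")          # the curly apostrophe survives

def read_text(path, encoding="utf-8"):
    """Hypothetical helper: read a whole file with an explicit encoding."""
    with open(path, "r", encoding=encoding) as handle:
        return handle.read()

print(repr(trimmed), repr(cleaned))
print(garbled, correct)
```

If that is what is happening, opening the file with an explicit encoding, as read_text does, is likely a cleaner fix than replacing the symbols afterwards.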