Question
We begin with a feature extraction function. The features we are going to use are called trigrams. A trigram is simply a string of three contiguous characters. For example, the string "I love computing" contains many trigrams (n - 2 to be precise, where n is the length of the string): ["I l", " lo", "lov", "ove"] are the first four of them, in sequence.
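As a quick illustration of the definition above, the following minimal sketch lists the trigrams of the example string by slicing. The variable names `text` and `trigrams` are chosen here for illustration and are not part of the assignment.

```python
# Illustrative only: enumerate every trigram of the example string by slicing.
text = "I love computing"

# A string of length n has n - 2 trigrams, one starting at each index 0 .. n - 3.
trigrams = [text[i:i + 3] for i in range(len(text) - 2)]

print(trigrams[:4])                   # the first four trigrams, in sequence
print(len(text), len(trigrams))       # string length and number of trigrams
```

The string has 16 characters, so the list holds 14 trigrams, and the first four match those given in the question.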
Write a function count_trigrams(document) that takes a string and returns a default dictionary with the frequency counts of the trigrams within the string (noting that if the same trigram is repeated in the string, its frequency will be greater than 1). Note that the output must be a default dictionary and not a standard dictionary, as it will be useful later. Note also that you should not modify the string in any way (e.g. remove punctuation, remove whitespace or convert to lower case) when calculating the frequencies.
Your code should behave as follows:
My thinking:
from collections import defaultdict as dd

def count_trigrams(document):
    """ count_trigrams takes a string and returns a dictionary of the counts
    of trigrams within the document. """
Explanation / Answer
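from collections import defaultdict

def count_trigrams(msg):
    # Use int as the default factory so counts start at 0 and stay integers.
    d = defaultdict(int)
    # A string of length n has n - 2 trigrams; the range is empty if n < 3.
    for i in range(len(msg) - 2):
        # Slice three contiguous characters without modifying the string.
        d[msg[i:i + 3]] += 1
    return d

print(count_trigrams("hel"))
print(count_trigrams("aaaaa"))
print(count_trigrams("Boaty mcBoatFace"))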
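The question says a default dictionary "will be useful later" without saying why. One plausible reason, which is an assumption here and not something the question states, is that looking up a trigram that never occurred yields 0 instead of raising a KeyError. The sketch below redefines count_trigrams so that it runs on its own; the trigram "zzz" is just an arbitrary unseen example.

```python
from collections import defaultdict

def count_trigrams(document):
    """Return a defaultdict(int) mapping each trigram in document to its count."""
    counts = defaultdict(int)
    for i in range(len(document) - 2):
        counts[document[i:i + 3]] += 1
    return counts

counts = count_trigrams("aaaaa")

# A repeated trigram has a frequency greater than 1.
print(counts["aaa"])

# .get() reads a missing trigram as 0 without adding it to the dictionary.
print(counts.get("zzz", 0))

# Indexing a missing trigram also gives 0, but inserts the key as a side effect.
print(counts["zzz"], len(counts))
```

Note the side effect in the last line: indexing a defaultdict with a missing key stores that key, so .get() is the safer choice when the dictionary should stay unchanged.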