
Question

We begin with a feature extraction function. The features we are going to use are called trigrams. A trigram is simply a string of three contiguous characters. For example, in the string "I love computing" there are lots of trigrams (n - 2 to be precise, where n is the length of the string): ["I l", " lo", "lov", "ove"] are the first four of them, in sequence.
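To make the definition concrete, here is a minimal sketch that lists the trigrams of a string by sliding a three-character window across it. The helper name trigrams is hypothetical and not part of the question; it only illustrates that a string of length n (with n at least 3) yields n - 2 trigrams.

```python
def trigrams(text):
    """Return every run of three contiguous characters in text, in order.

    Hypothetical helper for illustration; not part of the original question.
    """
    # range() is empty when len(text) < 3, so short strings give an empty list
    return [text[i:i + 3] for i in range(len(text) - 2)]


sample = "I love computing"
grams = trigrams(sample)
print(grams[:4])               # ['I l', ' lo', 'lov', 'ove']
print(len(sample), len(grams)) # 16 14
```

Note that spaces count as characters, so " lo" is a valid trigram.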

Write a function count_trigrams(document) that takes a string and returns a default dictionary with the frequency counts of the trigrams within the string (noting that if you have k repeats of the same trigram in the string, its frequency will be k). Note that the output must be a default dictionary and not a standard dictionary, as it will be useful later. Note also that you should not modify the string in any way (e.g. remove punctuation, remove whitespace or convert to lower case) in calculating the frequencies.
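The question does not say why a default dictionary will be useful later. One likely reason, stated here as an assumption, is that looking up a trigram that never occurred returns 0 instead of raising KeyError. A short sketch of that behaviour, using made-up trigram data:

```python
from collections import defaultdict

# defaultdict(int) supplies 0 for any key that has not been seen yet
counts = defaultdict(int)
for trigram in ["aaa", "aaa", "aab"]:  # illustrative data only
    counts[trigram] += 1

print(counts["aaa"])            # 2
print(counts["zzz"])            # 0 (a plain dict would raise KeyError here)
print("zzz" in counts)          # True: the lookup above inserted the key
print(isinstance(counts, dict)) # True: defaultdict is a dict subclass
```

Because a missing-key lookup with square brackets inserts the key, counts.get("zzz", 0) is the safer choice when the dictionary should not grow.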

Your code should behave as follows:

My thinking:

from collections import defaultdict as dd

def count_trigrams(document):
    """ count_trigrams takes a string and returns a dictionary of the counts
    of trigrams within the document. """

Explanation / Answer

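from collections import defaultdict

def count_trigrams(document):
    # int (not float) so counts are whole numbers, e.g. 3 rather than 3.0
    counts = defaultdict(int)
    # a string of length n has n - 2 trigrams; the range is empty when n < 3
    for i in range(len(document) - 2):
        trigram = document[i:i + 3]  # three contiguous characters, unmodified
        counts[trigram] += 1
    return counts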

print(count_trigrams("hel"))
print(count_trigrams("aaaaa"))
print(count_trigrams("Boaty mcBoatFace"))
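The expected outputs did not survive in the question, so the following is only a self-check sketch with values worked out by hand, not the original test cases. It restates the function so that it runs on its own, then confirms the counts for the three sample strings and that case is preserved.

```python
from collections import defaultdict


def count_trigrams(document):
    """Count every three-character window of document without altering it."""
    counts = defaultdict(int)
    for i in range(len(document) - 2):
        counts[document[i:i + 3]] += 1
    return counts


assert count_trigrams("hel") == {"hel": 1}
assert count_trigrams("aaaaa") == {"aaa": 3}   # overlapping repeats all count

boaty = count_trigrams("Boaty mcBoatFace")
assert boaty["Boa"] == 2 and boaty["oat"] == 2 # from "Boaty" and "BoatFace"
assert sum(boaty.values()) == 14               # 16 characters give 14 trigrams

# Case is preserved: "Aaa" and "aaa" are different trigrams
assert count_trigrams("Aaa aaa") != count_trigrams("aaa aaa")
print("all checks passed")
```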
  
