Question
PYTHON
I am working on the next problem.
Consider the following sentences written in Klingon. For each sentence, the part of speech of each “word” has been given (for ease of translation, some prefixes/suffixes have been treated as words), along with a translation. Using these training sentences, we’re going to build a Hidden Markov Model (HMM) to predict the part of speech of an unknown sentence using the Viterbi algorithm.
N PRO V N PRO
pa’Daq ghah taH tera’ngan ’e
room (inside) he is human of
The human is in the room
V N V N
ja’chuqmeH rojHom neH tera’ngan
in order to parley truce want human
The enemy commander wants a truce in order to parley
N V N CONJ N V N
tera’ngan qIp puq ’eg puq qIp tera’ngan
human bit child and child bit child
The child bit the human, and the human bit the child
Step 1: Creating the Emission probability table (emission.java or emission.py). Create an emission probability table by computing the frequency of each word under each POS tag in the table below. We’ll use a smoothing factor of 0.1 (as discussed in class) to make sure that no event is impossible; add this number to all of your observations. Sample table values for two parts of speech have been shown.
Probability(word | tag) = Count(word, tag) / Count(tag)
and here is what I got for this part:
from collections import defaultdict

# Training sentences: each word paired with its POS tag.
# Normalize the curly apostrophe to an ASCII one.
words1 = "pa’Daq ghah taH tera’ngan ’e".replace("’", "'").split()
tags1 = "N PRO V N PRO".split()
words2 = "ja’chuqmeH rojHom neH tera’ngan".replace("’", "'").split()
tags2 = "V N V N".split()
words3 = "tera’ngan qIp puq ’eg puq qIp tera’ngan".replace("’", "'").split()
tags3 = "N V N CONJ N V N".split()

train = [list(zip(words1, tags1)),
         list(zip(words2, tags2)),
         list(zip(words3, tags3))]

# Collect, for each word, the list of tags it was observed with.
tags_by_word = defaultdict(list)
for sent in train:
    for word, tag in sent:
        tags_by_word[word].append(tag)

# Build one row per word of smoothed counts: Count(word, tag) + 0.1.
rows = []
for word, observed in sorted(tags_by_word.items()):
    row = [word]
    for tag in ["N", "V", "CONJ", "PRO"]:
        row.append(observed.count(tag) + 0.1)
    rows.append(row)
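To turn those smoothed counts into the probabilities P(word | tag), each count has to be divided by a total for that tag. Since 0.1 is added to every (word, tag) cell, one common convention is to divide by Count(tag) + 0.1 × V, where V is the vocabulary size, so each tag's column sums to 1. That denominator is an assumption on my part; use whatever was defined in class. A self-contained sketch:

```python
from collections import defaultdict

# Tagged training data, as above (apostrophes already normalized).
sentences = [
    ("pa'Daq ghah taH tera'ngan 'e".split(), "N PRO V N PRO".split()),
    ("ja'chuqmeH rojHom neH tera'ngan".split(), "V N V N".split()),
    ("tera'ngan qIp puq 'eg puq qIp tera'ngan".split(), "N V N CONJ N V N".split()),
]
TAGS = ["N", "V", "CONJ", "PRO"]
SMOOTH = 0.1

# Raw counts of (word, tag) pairs and of each tag.
pair_counts = defaultdict(float)
tag_counts = defaultdict(float)
vocab = set()
for words, tags in sentences:
    for word, tag in zip(words, tags):
        pair_counts[(word, tag)] += 1
        tag_counts[tag] += 1
        vocab.add(word)

def emission(word, tag):
    # Smoothed P(word | tag):
    # (Count(word, tag) + 0.1) / (Count(tag) + 0.1 * |vocab|)
    return (pair_counts[(word, tag)] + SMOOTH) / (tag_counts[tag] + SMOOTH * len(vocab))

# tera'ngan appears 4 times as N, N appears 8 times, vocab has 11 words:
print(round(emission("tera'ngan", "N"), 4))  # (4 + 0.1) / (8 + 1.1) = 0.4505
```

With this denominator, summing emission(w, tag) over every word w in the vocabulary gives exactly 1 for each tag, which is a quick way to sanity-check the table.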
Here is the next step: Creating the Transition probability table (transition.py). Generate a transition probability table by calculating the transition frequencies from one POS tag to another. For each part of speech, total the number of times it transitioned to each other part of speech. Again, use a smoothing factor of 0.1. After you’ve done this, compute the start and transition probabilities. Sample table values of transitions for two parts of speech have been shown.
Probability(tag_i | tag_(i-1)) = Count(tag_(i-1), tag_i) / Count(tag_(i-1))
Can someone help me here?
Explanation / Answer
The three training sentences, with tags, words, and glosses:

N              PRO   V    N          PRO
pa'Daq         ghah  taH  tera'ngan  'e
room (inside)  he    is   human      of
The human is in the room

V                   N       V     N
ja'chuqmeH          rojHom  neH   tera'ngan
in order to parley  truce   want  human
The enemy commander wants a truce in order to parley

N          V    N      CONJ  N      V    N
tera'ngan  qIp  puq    'eg   puq    qIp  tera'ngan
human      bit  child  and   child  bit  child
The child bit the human, and the human bit the child
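For the transition step, the same counting idea applies to consecutive tag pairs within each sentence. Below is a minimal sketch; the start-probability handling and the smoothed denominators (adding 0.1 per possible next tag, so each row sums to 1) are my own assumptions, so follow whatever convention was given in class:

```python
from collections import defaultdict

TAGS = ["N", "V", "CONJ", "PRO"]
SMOOTH = 0.1

# Tag sequences of the three training sentences.
tag_seqs = [
    "N PRO V N PRO".split(),
    "V N V N".split(),
    "N V N CONJ N V N".split(),
]

start_counts = defaultdict(float)  # how often each tag begins a sentence
trans_counts = defaultdict(float)  # (prev_tag, next_tag) -> count
for seq in tag_seqs:
    start_counts[seq[0]] += 1
    for prev, nxt in zip(seq, seq[1:]):
        trans_counts[(prev, nxt)] += 1

def start_prob(tag):
    # Smoothed probability that a sentence starts with `tag`.
    return (start_counts[tag] + SMOOTH) / (len(tag_seqs) + SMOOTH * len(TAGS))

def trans_prob(prev, nxt):
    # Smoothed P(tag_i = nxt | tag_(i-1) = prev).
    total = sum(trans_counts[(prev, t)] for t in TAGS)
    return (trans_counts[(prev, nxt)] + SMOOTH) / (total + SMOOTH * len(TAGS))

# In the training data N is followed by V 3 times out of 6 transitions,
# so the smoothed value is (3 + 0.1) / (6 + 0.4):
print(round(trans_prob("N", "V"), 4))  # 3.1 / 6.4 = 0.4844
```

From here, the Viterbi algorithm combines these start and transition probabilities with the emission probabilities from Step 1, multiplying them along each candidate tag path and keeping the best path per tag at each position.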