Natural language processing using Python NLTK package, WILL RATE ASAP E-book: ht

ID: 672205 • Letter: N

Question

Natural language processing using Python NLTK package, WILL RATE ASAP

E-book: http://victoria.lviv.ua/html/fl5/NaturalLanguageProcessingWithPython.pdf

Question:

Complete the following subparts, in the order given:

1. Consider the sequence of words: Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo. This is a
grammatically correct sentence, as explained at http://en.wikipedia.org/wiki/
Buffalo_buffalo_Buffalo_buffalo_buffalo_buffalo_Buffalo_buffalo . Consider the tree diagram
presented on this Wikipedia page, and write down a suitable grammar. Normalize case to lowercase, to
simulate the problem that a listener has when hearing this sentence. Can you find other parses for this
sentence? How does the number of parse trees grow as the sentence gets longer? (More examples of these
sentences can be found at http://en.wikipedia.org/wiki/List_of_homophonous_phrases) .

2. You can modify the grammar in the recursive descent parser demo by selecting Edit Grammar in the Edit
menu. Change the first expansion production, namely NP -> Det N PP, to NP -> NP PP. Using the Step
button, try to build a parse tree. Describe what happens.

Explanation / Answer

def init_wfst(tokens, grammar): numtokens = len(tokens) wfst = [[None for i in range(numtokens+1)] for j in range(numtokens+1)] for i in range(numtokens): productions = grammar.productions(rhs=tokens[i]) wfst[i][i+1] = productions[0].lhs() return wfst def complete_wfst(wfst, tokens, grammar, trace=False): index = dict((p.rhs(), p.lhs()) for p in grammar.productions()) numtokens = len(tokens) for span in range(2, numtokens+1): for start in range(numtokens+1-span): end = start + span for mid in range(start+1, end): nt1, nt2 = wfst[start][mid], wfst[mid][end] if nt1 and nt2 and (nt1,nt2) in index: wfst[start][end] = index[(nt1,nt2)] if trace: print("[%s] %3s [%s] %3s [%s] ==> [%s] %3s [%s]" % (start, nt1, mid, nt2, end, start, index[(nt1,nt2)], end)) return wfst def display(wfst, tokens): print(' WFST ' + ' '.join(("%-4d" % i) for i in range(1, len(wfst)))) for i in range(len(wfst)-1): print("%d " % i, end=" ") for j in range(1, len(wfst)): print("%-4s" % (wfst[i][j] or '.'), end=" ") print() >>> tokens = "I shot an elephant in my pajamas".split() >>> wfst0 = init_wfst(tokens, groucho_grammar) >>> display(wfst0, tokens) WFST 1 2 3 4 5 6 7 0 NP . . . . . . 1 . V . . . . . 2 . . Det . . . . 3 . . . N . . . 4 . . . . P . . 5 . . . . . Det . 6 . . . . . . N >>> wfst1 = complete_wfst(wfst0, tokens, groucho_grammar) >>> display(wfst1, tokens) WFST 1 2 3 4 5 6 7 0 NP . . S . . S 1 . V . VP . . VP 2 . . Det NP . . . 3 . . . N . . . 4 . . . . P . PP 5 . . . . . Det NP 6 . . . . . . N

Navigate

Natural language processing is a subfield of computer science focusing on how co

Natural language processing using Python NLTK package, WILL RATE ASAP E-book: ht

Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.

Natural language processing using Python NLTK package, WILL RATE ASAP E-book: ht

Question

Explanation / Answer

Related Questions

Navigate