Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.


ID: 3801594 • Letter: A

Question

Algorithm Tokenize(string s):

    list(string) tokens := []
    while not isEmpty(s):
        if s begins with a token, remove the longest possible token from the beginning of s and push that token onto the back of tokens
        otherwise, if head(s) is a whitespace character, pop(s)
    return tokens

For example, below is a trace of the call Tokenize("hello, world 3+1"):

# it    tokens                                    s
0       []                                        "hello, world 3+1"
1       ["hello"]                                 ", world 3+1"
2       ["hello", ","]                            " world 3+1"
3       ["hello", ","]                            "world 3+1"
4       ["hello", ",", "world"]                   " 3+1"
5       ["hello", ",", "world"]                   "3+1"
6       ["hello", ",", "world", "3"]              "+1"
7       ["hello", ",", "world", "3", "+"]         "1"
8       ["hello", ",", "world", "3", "+", "1"]    ""
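The "remove the longest possible token" rule in the loop can be sketched on its own. In this sketch the token classes (single-quoted strings, words, numbers, and single symbol characters) are an assumption inferred from the trace and the test cases, not something the problem statement spells out:

```python
import re

# Candidate token patterns, each tried at the start of the string.
# These classes are an assumption inferred from the examples:
PATTERNS = [
    r"'(?:[^'\\]|\\.)*'",   # single-quoted string, backslash escapes allowed
    r"[A-Za-z]+",           # word
    r"\d+(?:\.\d+)?",       # number with an optional fractional part
    r"\S",                  # any other single non-whitespace character
]

def longest_token(s):
    """Return the longest token s begins with, or None if s is empty
    or starts with whitespace."""
    matches = [m.group(0) for p in PATTERNS for m in [re.match(p, s)] if m]
    return max(matches, key=len) if matches else None
```

For instance, longest_token("hello, world 3+1") picks the word match "hello" over the single-character match "h", which is exactly why iteration 1 of the trace consumes the whole word at once.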

Homework:

Implement Tokenize in Python. That is, write a Python function called tokenize such that if s is a string for which Tokenize(s) is defined, then Tokenize(s) == tokenize(s).

Test cases:

Note the following:

    tokenize("hello world")        →  ["hello", "world"]
    tokenize("hello'world'foo")    →  ["hello", "'world'", "foo"]
    tokenize("'hello\'world'")     →  ["'hello\'world'"]
    tokenize("'hello world'")      →  ["'hello world'"]
    tokenize("3.33(33..")          →  not in test suite
    tokenize("\")                  →  ["\"]
    tokenize(" ")                  →  []
    tokenize("'a'e'c'")            →  ["'a'", "e", "'c'"]
    tokenize("3.3+1")              →  ["3.3", "+", "1"]

For example, you could enter into repl.it or IDLE:

   >>> tokenize("'hello\'world'") == ["'hello\'world'"]
   True

Send your tokenizer as an attached .py file.


Explanation / Answer

Answer:

Below is the required script in Python:

import nltk                              # Natural Language Toolkit
nltk.download("punkt", quiet=True)       # fetch tokenizer data on first run
sentence = input("Enter the string that you want to be tokenized: ")
tokens = nltk.word_tokenize(sentence)
print(tokens)
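NLTK's word_tokenize applies its own English tokenization rules, so it will not reproduce the Tokenize specification above (it treats quotes and punctuation differently) and will not pass the test cases. A direct translation of the pseudocode is needed. Below is one sketch; the token classes (quoted strings, words, numbers, single symbols) are assumptions inferred from the test cases, since the problem statement never lists them:

```python
import re

# Token classes, an assumption inferred from the test cases:
_PATTERNS = [
    r"'(?:[^'\\]|\\.)*'",   # single-quoted string, backslash escapes allowed
    r"[A-Za-z]+",           # word
    r"\d+(?:\.\d+)?",       # number with an optional fractional part
    r"\S",                  # any other single non-whitespace character
]

def tokenize(s):
    """Translate the Tokenize pseudocode: repeatedly remove the longest
    token at the front of s, or pop one whitespace character."""
    tokens = []
    while s:                                # while not isEmpty(s)
        if s[0].isspace():
            s = s[1:]                       # pop(s) on whitespace
            continue
        # longest possible token at the front of s
        token = max((m.group(0) for p in _PATTERNS
                     for m in [re.match(p, s)] if m), key=len)
        tokens.append(token)                # push onto the back of tokens
        s = s[len(token):]
    return tokens
```

For example, tokenize("hello, world 3+1") reproduces the trace above, and raw strings make the backslash cases explicit: tokenize(r"'hello\'world'") == [r"'hello\'world'"].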
