Develop a program that will find out how many unique words appear in the text of
ID: 3615513 • Letter: D
Question
Develop a program that will find out how many unique words appear in the text of a book, and for each word, find out how many times it is used. The words will be located in a text file. The user will specify the name of the text file, or you may allow the name to be a command-line argument. If the file fails to open, enter into dialog with the user to get the name of a file that can be opened.

Ignore all punctuation in the text file not associated with a word. That is, ignore periods, commas, semicolons, colons, various parentheses, and exclamation points, as well as punctuation I haven't thought to mention. Do keep the apostrophe in can't. Treat words with apostrophes as different unique words so that "street" is different from "street's"; also, treat plurals as different from their singulars and the various forms of verbs as different. This is a hash table exercise, not a natural language parsing exercise. You may also treat numbers as words.

- Total number of collisions on insertions. A collision occurs when a word hashes to a location already occupied by another word. Hint: this means that the HashTable object requires a field to accumulate the total collisions when a word has not yet been entered into the table.
- Average number of collisions per word placement. This is just the total number of collisions divided by the words placed into the hash table. Hint: this means that the HashTable object also requires a field to accumulate the total words in the table.
Any help, ideas, or references to helpful resources will be very much appreciated.
Explanation / Answer
Hi. Interesting problem. You didn't mention what programming language you are using, but I can give you a general idea of where to start and how to solve the problem.

Get the file name from the command line as a string and call a function to open the file with that name. If the function returns an error, print a message telling the user the file could not be opened; otherwise, continue to the next step.

How you parse the text file depends on the programming language you are using, but start by looking for spaces: the characters you store between two spaces make up a word. Because punctuation is ignored, you will have to store one character at a time, checking whether each character is valid (i.e., not punctuation).

A hash table is basically a list of keys in which each key points to some information. As you parse the text file, store each word as a key in the hash table, and let the information that key points to be the number of times the word has appeared. The trick is to check whether the key already exists before creating a new one. Depending on the programming language you are using, there could be an easy way to do this check. If the key (word) exists in the hash table, add 1 to the counter it points to; otherwise, create a new key in the hash table for the word, with its counter set to 1.

Example pseudocode:

    Text: "Programming can be fun and can be challenging."

    word = Programming
    if (searchHashTable(word) == true)   // found word in hash table
    {
        hashTable(word).counter++        // how the counter for a key is accessed depends on the language
    }
    else                                 // word is not in the hash table yet
    {
        new hashTable.key = word
        hashTable(word).counter = 1      // first occurrence of this word
    }

Repeat the above for each word until the file ends.

Now that all your data is stored in the hash table, you just have to display that information in your output file. All the keys in the hash table are unique, so the number of keys is the number of unique words.
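The parsing and counting steps described above can be sketched in Python, purely as an illustration since the question does not name a language. The names `WORD_RE`, `tokenize`, and `count_words` are made up for this example, and lowercasing every word is an assumption (the assignment does not say whether "The" and "the" should count as the same word). The built-in `dict` stands in for the hash table here.

```python
import re

# Assumption: a word is a run of letters or digits, optionally joined by
# internal apostrophes, so "can't" and "street's" are kept whole while
# surrounding punctuation is dropped.
WORD_RE = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z0-9]+)*")


def tokenize(text):
    """Return the words in text, lowercased (an assumption), without punctuation."""
    return [word.lower() for word in WORD_RE.findall(text)]


def count_words(text):
    """Map each unique word to the number of times it appears."""
    counts = {}
    for word in tokenize(text):
        if word in counts:        # key already exists: bump its counter
            counts[word] += 1
        else:                     # first occurrence: create the key
            counts[word] = 1
    return counts


counts = count_words("Programming can be fun and can be challenging.")
print(len(counts))            # number of unique words
print(sum(counts.values()))   # total number of words
```

A regular expression is used instead of the character-by-character scan only to keep the sketch short; both approaches should yield the same words for ordinary English text.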
Each key (word) has a counter that tells you how many times that word appeared. Add up the counters for all the keys and you will have the total number of words. Extract the information this way and display it as required in your output file.

Remember to close any files that you have opened once you are finished with them.

I hope that helps. Depending on the programming language you are using, there are quite a few textbooks out there that explain how to program specific things like reading and writing to files and how to use hash tables. Good luck!
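The question also asks for collision statistics, which a built-in hash table will not report. Below is a minimal sketch, again in Python, of a hand-written table that keeps the two fields the hints describe. Everything in it is an assumption made for illustration: the class layout, linear probing as the collision strategy, the simple multiplier-31 string hash, the fixed capacity, and the choice to count one collision for every occupied slot probed while placing a new word.

```python
class HashTable:
    """Fixed-size open-addressing table (linear probing) mapping word -> count."""

    def __init__(self, capacity=101):
        self.slots = [None] * capacity  # each slot is None or [word, count]
        self.words_placed = 0           # distinct words stored in the table
        self.total_collisions = 0       # collisions seen while placing new words

    def _hash(self, word):
        # Simple polynomial string hash; the multiplier 31 is an arbitrary choice.
        h = 0
        for ch in word:
            h = (h * 31 + ord(ch)) % len(self.slots)
        return h

    def add(self, word):
        """Count one occurrence of word, inserting it if it is new."""
        index = self._hash(word)
        collisions = 0
        for _ in range(len(self.slots)):
            slot = self.slots[index]
            if slot is None:
                # New word: place it and record the collisions met on the way.
                self.slots[index] = [word, 1]
                self.words_placed += 1
                self.total_collisions += collisions
                return
            if slot[0] == word:
                # Word already present: only its counter changes.
                slot[1] += 1
                return
            collisions += 1             # slot is occupied by another word
            index = (index + 1) % len(self.slots)
        raise RuntimeError("hash table is full")

    def count(self, word):
        """Return how many times word was added (0 if it is absent)."""
        index = self._hash(word)
        for _ in range(len(self.slots)):
            slot = self.slots[index]
            if slot is None:
                return 0
            if slot[0] == word:
                return slot[1]
            index = (index + 1) % len(self.slots)
        return 0

    def average_collisions(self):
        """Total collisions divided by the number of words placed."""
        if self.words_placed == 0:
            return 0.0
        return self.total_collisions / self.words_placed
```

Collisions are added to the running total only when a new word is placed, which matches the hint in the question. For a whole book, the capacity would need to be larger than the number of unique words, or the table would need to grow; this sketch simply raises an error when it is full.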