The following code counts how often different words occur in a text file. char t
ID: 3719469 • Letter: T
Question
The following code counts how often different words occur in a text file.
char toLC (char c) {
return (c >= 'A' && c <= 'Z') ? c-'A'+'a' : c;
}
void countWordsInFile (istream& inFile, map& counts) {
string word;
while (inFile >> word) { // strip away any non-alphabetic characters
string::size_type n = word.find_first_not_of (wordCharacters);
if (n != string::npos) {
word = word.substr (0, n);
if (word.length() > 0) {
// if there's anything left, count it
word = transform(word.begin(), word.end(), word.begin(), toLC);
/** ... increment the appropriate counter in counts ... **/
map::iterator pos = counts.find (word);
if (pos == counts.end()) { // if this is the 1st time we've seen // this word
counts.insert (map::value_type(word, 1)); }
else { // else if we've seen this word before {
++((*pos).second); }
}
}
} Let W denote the number of words in being processed. Many of these words, in a typical document, will be duplicates. Let D denote the number of distinct (non-duplicate) words, D <= W. Assume the length of any individual word is bounded by some constant (e.g., no single word is longer than "antidisestablishmentarianism"). What is the average-case complexity for countWordsInFile? (examle O(n), O(logn), O(nlogn), etc) PLEASE SHOW ALL WORK ON DETERMINING COMPLEXITY
Explanation / Answer
First, we are trying to find the number of Strings that only appear exactly once in a file.
Once we read from a file, we are storing the words in two HashSet Objects.
So, there is no need for sorting also. A better approach would be to use two HashSet One for maintaining repeating and one for non-repeating words. Since HashSet internally uses HashMap, Ideally contains, get(), put() method has o(1) complexity. Hence, the overall complexity of this approach would be o(n).
Related Questions
drjack9650@gmail.com
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.