
Text Analysis using Java Collections Framework


Question

There is some debate about the influence of Jane Austen on Charlotte Bronte's work as a writer (and, more generally, on all three Bronte sisters). If you are interested in finding out more, feel free to search for their works and the debate.
For this exercise, using the publicly available books on Project Gutenberg (http://www.gutenberg.org), you are asked to find the top 10 words, and the number of times they occur, that appear in books by Charlotte Bronte but are not used by Jane Austen.
To simplify this exercise, we will only use the following books:

Jane Austen:
- Pride and Prejudice (http://www.gutenberg.org/files/1342/1342-0.txt)
- Emma (http://www.gutenberg.org/files/158/158-0.txt)
- Sense and Sensibility (http://www.gutenberg.org/cache/epub/161/pg161.txt)
- Persuasion (http://www.gutenberg.org/cache/epub/105/pg105.txt)
- Mansfield Park (http://www.gutenberg.org/files/141/141-0.txt)

Charlotte Bronte:
- Jane Eyre: An Autobiography (http://www.gutenberg.org/cache/epub/1260/pg1260.txt)
- Villette (http://www.gutenberg.org/cache/epub/9182/pg9182.txt)
- Shirley (http://www.gutenberg.org/files/30486/30486-0.txt)
- The Professor (http://www.gutenberg.org/files/1028/1028-0.txt)


The HashMap (or TreeMap) Java collection will come in handy for finding words and keeping their counts. You are only allowed to use Java Collections as described in Chapter 11 of our class textbook.
Hint: refer to the Word Count Map case study in Chapter 11.
To receive full credit, submit the following:
- Java source code file(s)
- Nicely formatted report containing a table of the top 10 words found in Charlotte Bronte's books and not in Jane Austen's, with their counts, and a summary of your insights from this exercise, including how this type of analysis can be useful and any suggestions for improvements.

Explanation / Answer

Hi,

Please find the working code below.

Before running it, note the following:

1) Save the plain text of the book you want to analyze (for example, http://www.gutenberg.org/files/1342/1342-0.txt) as input.txt on the D drive. The program reads this local file; it does not download the URL.

2) After the run, check output.txt on the D drive for the top 10 words and their occurrence counts for the given input file.

package javaapplication2;

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Counts the words in a local text file and writes the 10 most frequent
 * words, with their counts, to an output file.
 *
 * @author Archit
 */
public class JavaApplication2 {

    // Local copy of the book text, e.g. saved from
    // http://www.gutenberg.org/files/141/141-0.txt
    private static final String INPUT_FILE = "D:\\input.txt";
    private static final String OUTPUT_FILE = "D:\\output.txt";
    private static final int TOP_N = 10;

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) {
        Map<String, Integer> wordCountMap = new HashMap<String, Integer>();

        // try-with-resources closes the reader even if an exception is thrown
        try (BufferedReader bufferReader = new BufferedReader(new FileReader(INPUT_FILE))) {
            String sCurrentLine;
            while ((sCurrentLine = bufferReader.readLine()) != null) {
                // Lower-case the line and split on anything that is not a letter or apostrophe
                for (String word : sCurrentLine.toLowerCase().split("[^a-z']+")) {
                    if (word.isEmpty()) {
                        continue; // split() yields an empty token when a line starts with a separator
                    }
                    Integer count = wordCountMap.get(word);
                    wordCountMap.put(word, count == null ? 1 : count + 1);
                }
            }
        } catch (IOException ex) {
            System.out.println("Error while counting words from " + INPUT_FILE + ": " + ex.getMessage());
            return;
        }

        // Sort the (word, count) entries by count, highest first
        List<Map.Entry<String, Integer>> entries =
                new ArrayList<Map.Entry<String, Integer>>(wordCountMap.entrySet());
        Collections.sort(entries, new Comparator<Map.Entry<String, Integer>>() {
            @Override
            public int compare(Map.Entry<String, Integer> o1, Map.Entry<String, Integer> o2) {
                return o2.getValue().compareTo(o1.getValue());
            }
        });

        // Write the top entries as "word count", one per line; closing the writer flushes it
        try (BufferedWriter bufferedWriter = new BufferedWriter(new FileWriter(OUTPUT_FILE))) {
            for (int i = 0; i < TOP_N && i < entries.size(); i++) {
                Map.Entry<String, Integer> entry = entries.get(i);
                bufferedWriter.write(entry.getKey() + " " + entry.getValue());
                bufferedWriter.newLine();
            }
        } catch (IOException ex) {
            System.out.println("Error while writing to " + OUTPUT_FILE + ": " + ex.getMessage());
        }
    }
}
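
The program above works on a single input file. The question also asks for words that occur in Charlotte Bronte's books but not in Jane Austen's, so one possible way to extend the idea is to build one count map per author and skip every Bronte word that also appears as a key in the Austen map. The following is only a rough sketch under assumptions: the class name BronteOnlyWords and the methods countWords and topWordsNotIn are hypothetical names chosen for illustration, and the two sample sentences stand in for the book files, which you would read line by line instead.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class; the names are illustrative, not part of the assignment.
class BronteOnlyWords {

    /** Adds the words of one line of text to the running counts. */
    static void countWords(String line, Map<String, Integer> counts) {
        for (String word : line.toLowerCase().split("[^a-z']+")) {
            if (word.isEmpty()) {
                continue; // leading separator produces an empty token
            }
            Integer old = counts.get(word);
            counts.put(word, old == null ? 1 : old + 1);
        }
    }

    /** Returns up to n entries whose word is absent from excluded, most frequent first. */
    static List<Map.Entry<String, Integer>> topWordsNotIn(
            Map<String, Integer> counts, Set<String> excluded, int n) {
        List<Map.Entry<String, Integer>> candidates =
                new ArrayList<Map.Entry<String, Integer>>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (!excluded.contains(e.getKey())) {
                candidates.add(e);
            }
        }
        Collections.sort(candidates, new Comparator<Map.Entry<String, Integer>>() {
            @Override
            public int compare(Map.Entry<String, Integer> a, Map.Entry<String, Integer> b) {
                int byCount = b.getValue().compareTo(a.getValue());
                // Break ties alphabetically so the result is deterministic
                return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
            }
        });
        return new ArrayList<Map.Entry<String, Integer>>(
                candidates.subList(0, Math.min(n, candidates.size())));
    }

    public static void main(String[] args) {
        Map<String, Integer> bronte = new HashMap<String, Integer>();
        Map<String, Integer> austen = new HashMap<String, Integer>();

        // Made-up sample text; in practice call countWords for every line of every book
        countWords("The moor was wild and the moor was cold", bronte);
        countWords("The ball was crowded and the evening was long", austen);

        for (Map.Entry<String, Integer> e : topWordsNotIn(bronte, austen.keySet(), 10)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

With the sample sentences, "the", "was" and "and" are dropped because they also occur in the Austen text, leaving "moor" (2), "cold" (1) and "wild" (1). The same two calls could be repeated for each line of the five Austen files and the four Bronte files before asking for the top 10.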
