
Finding duplicate data in a sorted file is the first step to removing duplicate


Question

Finding duplicate data in a sorted file is the first step to removing duplicate data. Given an input file in which each line is a record of data and the first token (word) is the key that the file is sorted on, we want to read the file and output the line number and record for any duplicate keys we encounter. Remember that we are assuming the file is sorted by the key, and that we want to output to the screen the records (and line numbers) with duplicate keys. Download these 2 test files (create the other test files yourself). Note: the input text files must be in the same directory as your program.

http://www.cs.iit.edu/~cs201/labs/Lab9/input1.txt

http://www.cs.iit.edu/~cs201/labs/Lab9/input2.txt

Create a FindDuplicates class with the following:
Declaration of an instance variable for the String filename
Non-default constructor - creates an object from the user-passed filename argument
Accessor - returns the value of each instance variable
Mutator - allows the user to set each instance variable (no validation required)
A "getDuplicates()" method that reads from the file (until end-of-file), finds duplicate records based on the first token on each line (the key), and returns as a String the record number and entire duplicate record, one to a line (see above)
toString() - returns a String message with the value of the instance variable
Create a FindDuplicatesApp driver program to test your FindDuplicates class.

Here is my FindDuplicates class. I am stuck on the getDuplicates() method.
public class FindDuplicates
{
private String fileName;

public FindDuplicates()
{
setFileName(newFileName);
}

public String getFileName()
{
return fileName;
}

public void setFileName(String newFileName)
{
if (fileName != newFileName)
fileName = newFileName;
}

public int getDuplicates()
{
File duplicateDoc = new File(fileName);
Scanner input = new Scanner(duplicateDoc);
int number;
int count;
while(input.hasNext() && input.hasNextLine())
{
String line = null;
line = input.nextLine();
if (line != null)
{

}
}
}
I know I haven't done the main part here. I don't really understand the question to begin with. Please help, and correct my code if necessary.

Explanation / Answer
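
Since the question itself was unclear to the asker, here is the core idea in isolation: because the file is sorted by key, records with the same key always sit on adjacent lines, so it is enough to compare each line's key with its neighbours. The sketch below works on an in-memory list instead of a file. The class name AdjacentKeyDemo and the sample records are made up for illustration, and it assumes there are no blank lines.

```java
import java.util.List;

// Hypothetical demo class; the name and the sample records are invented for illustration.
class AdjacentKeyDemo {

    // Returns "lineNumber record" (one per line) for every record whose key
    // matches the key of the record directly before or after it.
    static String findDuplicates(List<String> lines) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < lines.size(); i++) {
            String key = firstToken(lines.get(i));
            boolean sameAsPrev = i > 0 && key.equals(firstToken(lines.get(i - 1)));
            boolean sameAsNext = i + 1 < lines.size() && key.equals(firstToken(lines.get(i + 1)));
            if (sameAsPrev || sameAsNext) {
                // Line numbers are 1-based, list indexes are 0-based.
                out.append(i + 1).append(' ').append(lines.get(i)).append('\n');
            }
        }
        return out.toString();
    }

    // The key is the first whitespace-separated token of the record.
    static String firstToken(String line) {
        return line.trim().split("\\s+")[0];
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "1001 apple",
            "1002 banana",
            "1002 blueberry",
            "1003 cherry");
        // Prints:
        // 2 1002 banana
        // 3 1002 blueberry
        System.out.print(findDuplicates(lines));
    }
}
```

The file-based class below follows the same idea, but it remembers only the previous key and line instead of looking ahead.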

FindDuplicates.java

import java.io.*;
import java.util.*;

public class FindDuplicates {
    private String fileName;

    public FindDuplicates(String newFileName)
    {
        setFileName(newFileName);
    }

    public String getFileName()
    {
        return fileName;
    }

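    public void setFileName(String newFileName)
    {
        fileName = newFileName;
    }

    // The assignment also asks for toString(); the message format here is a free choice.
    @Override
    public String toString()
    {
        return "FindDuplicates[fileName=" + fileName + "]";
    }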

    public String getDuplicates()
    {
        File duplicateDoc = new File(fileName);
        Scanner input;
        try {
            input = new Scanner(duplicateDoc);
        } catch (FileNotFoundException e) {
            System.out.println(e);
            return "";
        }
       
        int count = 0; // Line number of the most recently read record (becomes 1 after the first read)
       
        String key = "";
        String line = "";
        String duplicates = "";
        while(input.hasNext() && input.hasNextLine())
        {
            String lastKey = key;
            String lastLine = line;
            key = input.next();
            line = input.nextLine();
            count++;
            if (key.equals(lastKey)) {
                // We found a duplicate: report the previous record and this one,
                // one per line. nextLine() returns the rest of the line, including
                // the whitespace after the key, so key + line rebuilds the record.
                duplicates = duplicates + (count-1) + " " + lastKey + lastLine + "\n";
                duplicates = duplicates + count + " " + key + line + "\n";
                // See if there are more duplicates of this record
                while (input.hasNext() && input.hasNextLine()) {
                    key = input.next();
                    line = input.nextLine();
                    count++;
                    if (key.equals(lastKey)) {
                        // We found another duplicate
                        duplicates = duplicates + count + " " + key + line + "\n";
                    } else {
                        // We've come to the end of the duplicates
                        // We're on a new key now.
                        break;
                    }
                }
            }
                       
        }
        input.close(); // Release the file handle before returning
        return duplicates;
       
    }
}
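
One caveat with the Scanner-based version above: input.next() skips over blank lines, so the counter can drift away from the physical line number if the file contains empty lines. A possible alternative, sketched here under the assumption that blank lines should simply be ignored, reads whole lines with a BufferedReader and counts every one of them. The class and method names (LineBasedDuplicates, find) are hypothetical and not part of the assignment.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical alternative; class and method names are not from the original answer.
class LineBasedDuplicates {

    static String find(Path file) throws IOException {
        StringBuilder out = new StringBuilder();
        String prevKey = null;         // key of the previous non-blank line
        String prevLine = null;        // the previous non-blank line itself
        int prevNumber = 0;            // physical line number of that line
        boolean prevReported = false;  // true if that line is already in the output
        int lineNumber = 0;

        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader reader = Files.newBufferedReader(file)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lineNumber++;                   // counts every physical line, blank or not
                if (line.trim().isEmpty()) {
                    continue;                   // assumption: blank lines carry no key
                }
                String key = line.trim().split("\\s+")[0];
                if (key.equals(prevKey)) {
                    if (!prevReported) {
                        // First repeat of this key: also report the line that started the run.
                        out.append(prevNumber).append(' ').append(prevLine).append('\n');
                    }
                    out.append(lineNumber).append(' ').append(line).append('\n');
                    prevReported = true;
                } else {
                    prevReported = false;
                }
                prevKey = key;
                prevLine = line;
                prevNumber = lineNumber;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // "input1.txt" is the file name used by the driver in this answer.
        System.out.print(find(Path.of(args.length > 0 ? args[0] : "input1.txt")));
    }
}
```

Tracking prevReported replaces the nested loop: a single pass is enough, and a run of three or more equal keys reports its first line only once.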

------------

FindDuplicatesApp.java

public class FindDuplicatesApp {

    /**
    * @param args
    */
    public static void main(String[] args) {
        String fileName1 = "input1.txt";
        String fileName2 = "input2.txt";
        FindDuplicates fd = new FindDuplicates(fileName1);
        System.out.print(fd.getDuplicates());
        fd.setFileName(fileName2);
        System.out.println();
        System.out.println();
        System.out.print(fd.getDuplicates());
    }

}
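
The assignment also says to create further test files yourself. As a rough sketch, a small helper like the one below could write one into the program's directory. The file name input3.txt, the class name MakeTestFile, and the records are all invented test data. The data is sorted by key and includes a run of three equal keys and a duplicate pair at the very end of the file, which are the cases most likely to expose counting mistakes.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper; "input3.txt" and the records below are invented test data.
class MakeTestFile {

    // Writes a small file, sorted by its first token, to the given path.
    static Path write(Path target) throws IOException {
        List<String> records = List.of(
            "1001 Adams 3.2",
            "1002 Baker 2.9",
            "1002 Baker 3.1",
            "1002 Baker 3.4",   // third record with the same key
            "1003 Clark 3.8",
            "1004 Davis 2.5",
            "1004 Davis 2.7");  // duplicate on the last line of the file
        return Files.write(target, records);
    }

    public static void main(String[] args) throws IOException {
        Path file = write(Path.of("input3.txt"));
        System.out.println("Wrote " + Files.readAllLines(file).size() + " records to " + file);
    }
}
```

With 1-based line numbers, a correct getDuplicates() should report lines 2, 3, 4, 6, and 7 of this file when "input3.txt" is passed to the FindDuplicates constructor.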
