
Data manipulation using pattern matching in R


Question

Data manipulation using pattern matching in R: We have an input dataset (.csv file) with only one field, which contains a job description. The dataset holds nearly identical job descriptions that differ slightly or are misspelled. Our goal is to compare those near-duplicate job descriptions against a pattern and replace them with a single, uniform job description in the input dataset. Below is a detailed example of how the code should work.

Varied job description entries (125 rows in total) in the input file, all of which should actually be 'manager':

- manager (say, occurs 100 times)
- management (occurs 10 times)
- manager-in-training (occurs 5 times)
- manager-pmac (occurs 3 times)
- manager) (occurs 4 times)
- managet (occurs 3 times)

Steps:

1. Read the .csv input file for the job description column.
2. Create a look-up table/file that contains the patterns to match against the job description entries in the dataset, e.g. our pattern can

Explanation / Answer

Here is sample Java code for the given problem. It works in four steps:

1. Take a file as input.
2. Read the file line by line.
3. Apply pattern matching to each line.
4. Print the output.


import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternMatchingFile {
    public static void main(String[] args) {
        // Only lines that start with "data" are processed.
        Pattern linePattern = Pattern.compile("data.*");
        // Assumption: the term to look up is the text between single quotes.
        Pattern quoted = Pattern.compile("'(.*?)'");
        // The values to print are the text between angle brackets.
        Pattern bracketed = Pattern.compile("<(.*?)>");

        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader br = new BufferedReader(new FileReader("thomas.txt"))) {
            String dataLine;
            while ((dataLine = br.readLine()) != null) {
                if (!linePattern.matcher(dataLine).matches()) {
                    continue;
                }
                Matcher mt = quoted.matcher(dataLine);
                while (mt.find()) {
                    // Build a prefix pattern from the quoted term, escaping regex metacharacters.
                    String prefix = Pattern.quote(mt.group(1)) + ".*";
                    System.out.println(prefix);

                    if (Pattern.matches(prefix, dataLine)) {
                        Matcher match = bracketed.matcher(dataLine);
                        while (match.find()) {
                            System.out.println(match.group(1));
                        }
                    } else {
                        System.out.println("No match for the specified pattern in the input file");
                    }
                }
            }
        } catch (IOException e) {
            System.err.println("Error reading file: " + e.getMessage());
        }
    }
}
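
The look-up step described in the question can also be sketched directly. The following is a minimal sketch, not a definitive implementation: the class name JobTitleNormalizer, its method names, and the single "manag" rule are all hypothetical, chosen only to cover the 'manager' example from the question. It keeps a table that maps each regex pattern to one canonical description and replaces every entry that matches a pattern with that description. The rows are hard-coded here; in practice they could be read from the .csv file, for example with java.nio.file.Files.readAllLines.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper: maps near-duplicate job descriptions to one canonical label.
final class JobTitleNormalizer {

    // Look-up table: regex pattern -> canonical description.
    // The single "manag" rule is an assumption based on the question's example.
    static Map<Pattern, String> defaultLookup() {
        Map<Pattern, String> lookup = new LinkedHashMap<>();
        lookup.put(Pattern.compile("manag.*", Pattern.CASE_INSENSITIVE), "manager");
        return lookup;
    }

    // Returns the canonical label of the first pattern that matches the whole entry,
    // or the trimmed entry unchanged when no pattern matches.
    static String normalize(String description, Map<Pattern, String> lookup) {
        String trimmed = description.trim();
        for (Map.Entry<Pattern, String> rule : lookup.entrySet()) {
            if (rule.getKey().matcher(trimmed).matches()) {
                return rule.getValue();
            }
        }
        return trimmed;
    }

    // Normalizes every row of a single-column dataset.
    static List<String> normalizeAll(List<String> rows, Map<Pattern, String> lookup) {
        List<String> result = new ArrayList<>();
        for (String row : rows) {
            result.add(normalize(row, lookup));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> rows = List.of("manager", "management", "manager-in-training",
                "manager-pmac", "manager)", "managet", "clerk");
        // Prints [manager, manager, manager, manager, manager, manager, clerk]
        System.out.println(normalizeAll(rows, defaultLookup()));
    }
}
```

A LinkedHashMap keeps the rules in insertion order, so when several patterns could match the same entry, the first one added wins. Entries that match no rule, such as "clerk" above, pass through unchanged.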