
Data manipulation using pattern matching in R


Question

Data manipulation using pattern matching in R: We have an input dataset (.csv file) with a single field that contains job descriptions. Many entries are nearly identical descriptions that differ only slightly or are misspelled. Our goal is to compare those near-duplicate job descriptions against a pattern and replace them with a single, uniform job description in the input dataset. Below is a detailed example of how the code should work.

Varied job description entries (125 rows in total) in the input file, all of which should actually be 'manager':

- manager (say, occurs 100 times)
- management (occurs 10 times)
- manager-in-training (occurs 5 times)
- manager-pmac (occurs 3 times)
- manager) (occurs 4 times)
- managet (occurs 3 times)

Steps:

- Read the .csv input file for the job description column.
- Create a look-up table/file containing the patterns we will try to match against the job description entries in the dataset, e.g. our pattern can b

Explanation / Answer

// Sample Java code for the given problem. It:
// 1. takes a file as input,
// 2. reads the file line by line,
// 3. matches each line against the patterns,
// 4. prints the output.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternMatchingFile {

    public static void main(String[] args) {

        // Only lines that start with "data" are processed.
        String input = "data.*";

        // Compile the patterns once, outside the read loop.
        // NOTE: the regex on the original matcher line was garbled beyond
        // recovery; reusing the quoted-text pattern here is an assumption.
        Pattern quoted = Pattern.compile("'(.*?)'");
        Pattern angled = Pattern.compile("<(.*?)>");

        // try-with-resources closes the reader even if an exception is thrown.
        try (BufferedReader br = new BufferedReader(new FileReader("thomas.txt"))) {

            String dataLine;
            while ((dataLine = br.readLine()) != null) {

                if (Pattern.matches(input, dataLine)) {

                    Matcher mt = quoted.matcher(dataLine);

                    // Matcher has no search() method; find() advances to the next match.
                    while (mt.find()) {

                        // Text between single quotes becomes a prefix pattern.
                        String x = mt.group(1);
                        String y = x + ".*";
                        System.out.println(y);

                        if (Pattern.matches(y, dataLine)) {
                            // Print every value enclosed in angle brackets.
                            Matcher match = angled.matcher(dataLine);
                            while (match.find()) {
                                System.out.println(match.group(1));
                            }
                        } else {
                            System.out.println("No match for the pattern taken from the input file");
                        }
                    }
                }
            }
        } catch (IOException e) {
            System.err.println("e: " + e.getMessage());
        }
    }
}
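
The look-up step described in the question (match each job description against a pattern and replace it with a uniform value) can be sketched as follows. This is only a minimal illustration in Java, to match the code above, not the asker's R solution. The class name JobTitleNormalizer, the methods normalize and normalizeAll, and the single rule "^manag.*" are all hypothetical; the rule is an assumption chosen because every variant listed in the question begins with "manag".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

class JobTitleNormalizer {

    // Hypothetical look-up table: regex pattern -> uniform job description.
    // LinkedHashMap keeps insertion order, so earlier rules take priority.
    private static final Map<Pattern, String> LOOKUP = new LinkedHashMap<>();
    static {
        // Assumed rule: anything starting with "manag" is a manager. This covers
        // manager, management, manager-in-training, manager-pmac, "manager)" and managet.
        LOOKUP.put(Pattern.compile("^manag.*", Pattern.CASE_INSENSITIVE), "manager");
    }

    // Returns the uniform description for the first matching rule,
    // or the trimmed input unchanged when no rule matches.
    static String normalize(String description) {
        String trimmed = description.trim();
        for (Map.Entry<Pattern, String> rule : LOOKUP.entrySet()) {
            if (rule.getKey().matcher(trimmed).matches()) {
                return rule.getValue();
            }
        }
        return trimmed;
    }

    // Applies normalize() to every row of the job description column.
    static List<String> normalizeAll(List<String> descriptions) {
        List<String> result = new ArrayList<>();
        for (String description : descriptions) {
            result.add(normalize(description));
        }
        return result;
    }

    public static void main(String[] args) {
        // In-memory stand-in for the job description column of the .csv file.
        List<String> rows = Arrays.asList(
                "manager", "management", "manager-in-training",
                "manager-pmac", "manager)", "managet", "clerk");
        System.out.println(normalizeAll(rows));
    }
}
```

For a real input file, the rows could be loaded with java.nio.file.Files.readAllLines before calling normalizeAll. A prefix rule this broad would also rewrite unrelated entries that happen to start with "manag", so a real look-up table would likely need narrower patterns.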