
Java: Problem Description: Using ArrayList [option I] or Array [option II], you


Question

Java: Problem Description: Using ArrayList [option I] or Array [option II], you will learn how the MapReduce algorithm works as a data analytics algorithm in most Big-data analysis problems.

To simplify the question, I will let you re-format the input file into 40 lines that contain only the meaningful numbers [option II].

The Big-data analysis procedure includes (1) reading, (2) decomposing, (3) mapping, (4) reorganizing or shuffling, and (5) reducing processes (20pts).

Please present the program solutions to the following questions.

First, write (1) a TemperatureLog class defining an object that holds the data extracted from each row of the given "data.txt" file, and (2) a TestTemperatureMain class as a driver program that solves the subsequent questions, each of which is to be implemented in its associated method.

// Class TemperatureLog

public class TemperatureLog {

   private String aRow;
   private int ID;
   private int year;
   private double tempDbl; // temperature value typed
   private int tempInt;    // either double or int
   private String sensorCode;

  

   public TemperatureLog() { // constructor 1

    aRow = "";
    ID = 0;
    year = 0;
    tempDbl = 0.0;
    tempInt = 0;
    sensorCode = "";

   }

   public TemperatureLog(String line) { // constructor 2

    aRow = line;
    ID = 0;
    year = 0;
    tempDbl = 0.0;
    tempInt = 0;
    sensorCode = "";

   }

// Getter methods

   ...

// Setter methods

   ...

} // end of class TemperatureLog

(1) Reading : Given "data.txt", please write a program, TestTemperatureMain.java, which reads 40 lines of data and builds an ArrayList of TemperatureLog objects [option I] or an Array of 40 TemperatureLog objects [option II] from the individual lines. The class TemperatureLog is to be defined and used to print out the lines of string on the console (System.out) (4pts): complete the following TestTemperatureMain.java.

Remark: for better performance, you may declare intermediate classes in addition to the given two classes.

// Class TestTemperatureMain

public class TestTemperatureMain {

   public static void main(String[] args) throws FileNotFoundException

   {

    ArrayList<TemperatureLog> tempList = new ArrayList<>();

    Scanner input = new Scanner(new File("data.txt"));

    // 1. Read data.txt file and build ArrayList of TemperatureLog for each line

    read2BuildArrayList(input, tempList);

    // 2. Decompose each row into separate fields

    decomposeArrayList(tempList);

    // 3. Map each row by extracting year and temperature

    //    into ArrayList (or Array) of (year, temperature)

    //    pair. Or you just need to keep the existing

    //    ArrayList and print out (year, temperature) pair

    //    on the console.

    //    Note: you can choose one of the following

    map(tempList, mappedList);

    map(tempList);

    // 4. Reorganize each mapped pair by merging them into

    //    the reorganized pair:(year, [array of temperature values of the year]),

    //    which merges the temperature values of the same year into

    //    the array of the temperature values.

    //    You might need to define new class used as reorganizedList.

    reorganize(tempList, reorganizedList);

    // 5. Reduce the above reorganizedList into the short (reduced) list.

    //    This is supposed to print out the list of

    //    (year, highest_temperature_value_of_the_year)

    reduce(reorganizedList);

   } // end of main()

   public static void read2BuildArrayList( ... ) {

   } // end of read2BuildArrayList()

   public static void decomposeArrayList( ... ) {

   } // end of decomposeArrayList()

   public static void map( ... ) {

   } // end of map()

   public static void reorganize( ... ) {

   } // end of reorganize()

   public static void reduce( ... ) {

   } // end of reduce()

} // end of class TestTemperatureMain
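The reading step can be sketched on its own, assuming each non-blank line of the file becomes one record. The class name ReadSketch and the method readLines are hypothetical, and the sketch keeps raw strings rather than TemperatureLog objects so that it compiles by itself; in the assignment each line would instead be wrapped with new TemperatureLog(line).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class ReadSketch {

    // Collects every non-blank line from the scanner, in input order.
    public static List<String> readLines(Scanner input) {
        List<String> rows = new ArrayList<>();
        while (input.hasNextLine()) {
            String line = input.nextLine().trim();
            if (!line.isEmpty()) {
                rows.add(line);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // A String-backed Scanner stands in for new Scanner(new File("data.txt")).
        String sample = "1 00001950A01 +0011+999999\n\n2 00001950A012 +0022+999999\n";
        for (String row : readLines(new Scanner(sample))) {
            System.out.println(row);
        }
    }
}
```

Skipping blank lines is a design choice made here because the listing of data.txt is shown with an empty line between rows; if the real file has no blank lines, the check is harmless.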

Please refer to the following descriptions on each step.

2) Decomposing : Next, you are supposed to decompose each line into an ArrayList of the following structure. (4pts)

- Encoding rule for each line:

     1st int : ID

     2nd String :

           0000 : meaningless 4-digit

           ____ : Year (4-digit)

           A001 : Sensor code (until the end of the String)

     3rd String :

           '+' or '-' : positive or negative

           digits (prior to the next '+' symbol) : temperature (Fahrenheit degree)

           following +999999 : meaningless

- Analyzed data within ArrayList of the TemperatureLog objects

You will generate another ArrayList which contains objects of class TemperatureLog, as follows:

public class TemperatureLog {

   private String aRow;
   private int ID;
   private int year;
   private double tempDbl;    // temperature value typed
   private int tempInt;       // either double or int
   private String sensorCode;

...

}

In the above driver class, TestTemperatureMain, you are supposed to create the following ArrayList of TemperatureLog objects.

...

ArrayList<TemperatureLog> tempList = new ArrayList<>();

...
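The encoding rule above can be illustrated with a small parser. This is only a sketch under stated assumptions: the class DecomposeSketch, its nested Fields holder, and the method decompose are hypothetical names, the three parts of a row are assumed to be separated by one or more spaces, and the temperature is kept as an int. In the assignment the decoded values would be stored through the TemperatureLog setters instead.

```java
public class DecomposeSketch {

    // Hypothetical holder for the decoded fields of one row.
    public static class Fields {
        public final int id;
        public final int year;
        public final String sensorCode;
        public final int temperature;

        public Fields(int id, int year, String sensorCode, int temperature) {
            this.id = id;
            this.year = year;
            this.sensorCode = sensorCode;
            this.temperature = temperature;
        }
    }

    // Decodes one row such as "5 00001950B001  +0099+999999".
    public static Fields decompose(String row) {
        String[] parts = row.trim().split("\\s+");
        int id = Integer.parseInt(parts[0]);

        String second = parts[1];
        // Characters 0-3 are the meaningless "0000", 4-7 are the year,
        // and the rest of the string is the sensor code.
        int year = Integer.parseInt(second.substring(4, 8));
        String sensorCode = second.substring(8);

        String third = parts[2];
        // The first character is the sign; the digits run up to the next '+'.
        int sign = third.charAt(0) == '-' ? -1 : 1;
        int end = third.indexOf('+', 1);
        int temperature = sign * Integer.parseInt(third.substring(1, end));

        return new Fields(id, year, sensorCode, temperature);
    }

    public static void main(String[] args) {
        Fields f = decompose("5 00001950B001  +0099+999999");
        System.out.println(f.id + " " + f.year + " " + f.sensorCode + " " + f.temperature);
    }
}
```

Splitting on "\\s+" rather than a single space matters because several rows in the input listing have two spaces before the third field.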

[Option I] Original Input Text File:

"data.txt"

1 00001950A01 +0011+999999

2 00001950A012 +0022+999999

3 00001950A02  +0065+999999

4 00001950A039 +0103+999999

5 00001950B001  +0099+999999

6 00001950B026  +0054+999999

7 00001950C006 +0109+999999

8 00001950D01 +0085+999999

9 00001950D30 +0072+999999

10 00001950E03  +0120+999999

11 00001951A01 +0026+999999

12 00001951A012 +0035+999999

13 00001951A02  +0059+999999

14 00001951A039 +0110+999999

15 00001951B001  +0103+999999

16 00001951B026  +0049+999999

17 00001951C006  +0099+999999

18 00001951D01 +0091+999999

19 00001951D30 +0085+999999

20 00001951E03 +0117+999999

21 00001953A01 +0026+999999

22 00001953A012 +0041+999999

23 00001953A02  +0069+999999

24 00001953A039 +0110+999999

25 00001953B001  +0100+999999

26 00001953B026 +0072+999999

27 00001953C006 +0087+999999

28 00001953D01 +0102+999999

29 00001953D30 +0095+999999

30 00001953E03 +0102+999999

31 00001954A01 +0033+999999

32 00001954A012 +0046+999999

33 00001954A02  +0057+999999

34 00001954A039 +0106+999999

35 00001954B001  +0119+999999

36 00001954B026  +0093+999999

37 00001954C006  +0057+999999

38 00001954D01 +0089+999999

39 00001954D30  +0088+999999

40 00001954E03  +0092+999999

[Option II] Simplified Input File:

     

"data2.txt"

1950 11

1950 22

1950 65

1950 103

1950 99

1950 54

1950 109

1950 85

1950 72

1950  120

1951 26

1951 35

1951  59

1951 110

1951  103

1951  49

1951  99

1951 91

1951 85

1951  117

1953 26

1953 41

1953  69

1953 110

1953  100

1953  72

1953 87

1953 102

1953 95

1953  102

1954 33

1954 46

1954 57

1954 106

1954  119

1954  93

1954  57

1954 89

1954  88

1954  92
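For Option II, each line of "data2.txt" already holds just a year and a temperature, so decomposing reduces to splitting on whitespace. The following is a minimal sketch, assuming both fields are integers; the class SimplifiedRowSketch and the method parsePair are hypothetical names.

```java
public class SimplifiedRowSketch {

    // Parses a "year temperature" line into {year, temperature}.
    // Any run of spaces between the two fields is accepted.
    public static int[] parsePair(String row) {
        String[] parts = row.trim().split("\\s+");
        int year = Integer.parseInt(parts[0]);
        int temperature = Integer.parseInt(parts[1]);
        return new int[] { year, temperature };
    }

    public static void main(String[] args) {
        int[] pair = parsePair("1950  120");
        System.out.println("(" + pair[0] + ", " + pair[1] + ")");
    }
}
```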

3) Mapping : You are supposed to map the data as follows: (4pts)

(1950, 11)

(1950, 22)

(1950, 65)

(1950, 103)

(1950, 99)

(1950, 54)

(1950, 109)

(1950, 85)

(1950, 72)

(1950, 120)

(1951, 26)

(1951, 35)

(1951, 59)

(1951, 110)

(1951, 103)

(1951, 49)

(1951, 99)

(1951, 91)

(1951, 85)

(1951, 117)

(1953, 26)

(1953, 41)

(1953, 69)

(1953, 110)

(1953, 100)

(1953, 72)

(1953, 87)

(1953, 102)

(1953, 95)

(1953, 102)

(1954, 33)

(1954, 46)

(1954, 57)

(1954, 106)

(1954, 119)

(1954, 93)

(1954, 57)

(1954, 89)

(1954, 88)

(1954, 92)

The above list should come from the intermediate (internal) ArrayList of YearTemp objects.
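The mapping step can be sketched as a projection from full records to (year, temperature) pairs. The assignment names a YearTemp class but does not show it, so the fields and toString below are assumptions; the nested Log class is a hypothetical stand-in for an already decomposed TemperatureLog, used only so the block compiles by itself.

```java
import java.util.ArrayList;
import java.util.List;

public class MapSketch {

    // Stand-in for a decomposed TemperatureLog; only what map() needs.
    public static class Log {
        public final int year;
        public final String sensorCode;
        public final int temp;

        public Log(int year, String sensorCode, int temp) {
            this.year = year;
            this.sensorCode = sensorCode;
            this.temp = temp;
        }
    }

    // Assumed shape of the YearTemp pair mentioned in the assignment.
    public static class YearTemp {
        public final int year;
        public final int temp;

        public YearTemp(int year, int temp) {
            this.year = year;
            this.temp = temp;
        }

        @Override
        public String toString() {
            return "(" + year + ", " + temp + ")";
        }
    }

    // Keeps only year and temperature from each record, preserving order.
    public static List<YearTemp> map(List<Log> logs) {
        List<YearTemp> pairs = new ArrayList<>();
        for (Log log : logs) {
            pairs.add(new YearTemp(log.year, log.temp));
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<Log> logs = new ArrayList<>();
        logs.add(new Log(1950, "A01", 11));
        logs.add(new Log(1950, "A012", 22));
        for (YearTemp pair : map(logs)) {
            System.out.println(pair);
        }
    }
}
```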

4) Reorganizing (or Shuffling) : The above list is to be reorganized and reduced as follows: (4pts)

You need to create a method which reads the list from question 3) and reorganizes it into the following form. A new class, AllInOne, might be defined to handle this function.

Reorganizing:

(1950, [11, 22, 65, 103, 99, 54, 109, 85, 72, 120])

(1951, [26, 35, 59, 110, 103, 49, 99, 91, 85, 117])

(1953, [26, 41, 69, 110, 100, 72, 87, 102, 95, 102])

(1954, [33, 46, 57, 106, 119, 93, 57, 89, 88, 92])
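One way to sketch the reorganizing (shuffling) step is to group temperatures by year. Note that this sketch swaps in a LinkedHashMap from java.util in place of the AllInOne class the assignment suggests, and it takes the mapped pairs as a plain int[][] of {year, temperature}; the names ReorganizeSketch and reorganize are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ReorganizeSketch {

    // Groups temperatures by year. LinkedHashMap keeps the years in the
    // order they first appear, and each list keeps its input order.
    public static Map<Integer, List<Integer>> reorganize(int[][] pairs) {
        Map<Integer, List<Integer>> byYear = new LinkedHashMap<>();
        for (int[] pair : pairs) {
            byYear.computeIfAbsent(pair[0], y -> new ArrayList<>()).add(pair[1]);
        }
        return byYear;
    }

    public static void main(String[] args) {
        int[][] pairs = { {1950, 11}, {1950, 22}, {1951, 26} };
        for (Map.Entry<Integer, List<Integer>> e : reorganize(pairs).entrySet()) {
            System.out.println("(" + e.getKey() + ", " + e.getValue() + ")");
        }
    }
}
```

Because grouping is by key rather than by position, the pairs for one year do not have to be adjacent in the input.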

5) Reducing : The above list is to be reduced as follows: (4pts)

Finally, the above 4 lines of intermediate data should be reduced by picking the highest temperature among the values contained in each line.

Reducing:

(1950, 120)

(1951, 117)

(1953, 110)

(1954, 119)
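The reducing step can be sketched as taking the maximum of each year's list. This is a minimal illustration, assuming the reorganized data is held as a map from year to a non-empty list of temperatures; the names ReduceSketch and reduce are hypothetical, and Collections.max would throw NoSuchElementException for an empty list.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ReduceSketch {

    // Replaces each year's list of temperatures with its highest value.
    public static Map<Integer, Integer> reduce(Map<Integer, List<Integer>> byYear) {
        Map<Integer, Integer> highest = new LinkedHashMap<>();
        for (Map.Entry<Integer, List<Integer>> e : byYear.entrySet()) {
            highest.put(e.getKey(), Collections.max(e.getValue()));
        }
        return highest;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> byYear = new LinkedHashMap<>();
        byYear.put(1953, Arrays.asList(26, 41, 69, 110, 100, 72, 87, 102, 95, 102));
        byYear.put(1954, Arrays.asList(33, 46, 57, 106, 119, 93, 57, 89, 88, 92));
        for (Map.Entry<Integer, Integer> e : reduce(byYear).entrySet()) {
            System.out.println("(" + e.getKey() + ", " + e.getValue() + ")");
        }
    }
}
```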

id     year     sensor code     temperature(F)
1      1950     A01             11
2      1950     A012            22
3      1950     A02             65
4      1950     A039            103
...    ...      ...             ...

Explanation / Answer

import java.util.*;
import java.io.*;