java: Problem Description: Using ArrayList [option I] or Array [option II], you
Question
java: Problem Description: Using ArrayList [option I] or Array [option II], you will learn how the MapReduce algorithm works as a data analytics algorithm in most Big-data analysis problems.
To simplify the problem, you may reformat the input file into 40 lines that contain only the meaningful numbers [option II].
The Big-data analysis procedure includes (1) reading, (2) decomposing, (3) mapping, (4) reorganizing or shuffling, and (5) reducing processes (20pts).
Please present the program solutions to the following questions.
First, write (1) a TemperatureLog class defining an object that holds the data extracted from each row of the given "data.txt" file and (2) a TestTemperatureMain class as a driver program that solves the following questions, each one implemented by its associated method.
// Class TemperatureLog
public class TemperatureLog {
private String aRow;
private int ID;
private int year;
private double tempDbl; // temperature value typed
private int tempInt; // either double or int
private String sensorCode;
public TemperatureLog() { // constructor 1
aRow = "";
ID = 0;
year = 0;
tempDbl = 0.0;
tempInt = 0;
sensorCode = "";
}
public TemperatureLog(String line) { // constructor 2
aRow = line;
ID = 0;
year = 0;
tempDbl = 0.0;
tempInt = 0;
sensorCode = "";
}
// Getter methods
...
// Setter methods
...
} // end of class TemperatureLog
(1) Reading : Given "data.txt", please write a program, TestTemperatureMain.java, which reads 40 lines of data and builds an ArrayList of TemperatureLog objects [option I] or an Array of 40 TemperatureLog objects [option II] from the individual lines. The class TemperatureLog is to be defined and used to print out each line as a string on the console (System.out) (4pts): complete the following TestTemperatureMain.java.
Remark: for the better performance, you may declare intermediate classes in addition to the given two classes.
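// Class TestTemperatureMain
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Scanner;
public class TestTemperatureMain {
public static void main(String[] args) throws FileNotFoundException
{
ArrayList<TemperatureLog> tempList = new ArrayList<>();
Scanner input = new Scanner(new File("data.txt"));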
// 1. Read data.txt file and build ArrayList of TemperatureLog for each line
read2BuildArrayList(input, tempList);
// 2. Decompose each row into separate fields
decomposeArrayList(tempList);
// 3. Map each row by extracting year and temperature
// into ArrayList (or Array) of (year, temperature)
// pair. Or you just need to keep the existing
// ArrayList and print out (year, temperature) pair
// on the console.
// Note: you can choose one of the following
map(tempList, mappedList);
map(tempList);
// 4. Reorganize each mapped pair by merging them into
// the reorganized pair:(year, [array of temperature values of the year]),
// which merges the temperature values of the same year into
// the array of the temperature values.
// You might need to define new class used as reorganizedList.
reorganize(tempList, reorganizedList);
// 5. Reduce the above reorganizedList into the short(reduced) List.
// This is supposed to print out the list of
// (year, highest_temperature_value_of_the_year)
reduce(reorganizedList);
} // end of main()
public static void read2BuildArrayList( ... ) {
} // end of read2BuildArrayList()
public static void decomposeArrayList( ... ) {
} // end of decomposeArrayList()
public static void map( ... ) {
} // end of map()
public static void reorganize( ... ) {
} // end of reorganize()
public static void reduce( ... ) {
} // end of reduce()
} // end of class TestTemperatureMain
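The reading step (1) can be sketched as below. This is only an illustration under assumptions: the class name ReadSketch and the method readLines are hypothetical, the rows are kept as raw strings rather than TemperatureLog objects, and the Scanner wraps a String so the example runs without a file. In the assignment, the Scanner would wrap new File("data.txt") as in the skeleton above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper, not part of the assignment skeleton.
class ReadSketch {
    // Collects every non-blank line from the scanner, in input order.
    static List<String> readLines(Scanner input) {
        List<String> rows = new ArrayList<>();
        while (input.hasNextLine()) {
            String line = input.nextLine().trim();
            if (!line.isEmpty()) {
                rows.add(line);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Two sample rows copied from data.txt; a real run would read the file instead.
        String sample = "1 00001950A01 +0011+999999\n2 00001950A012 +0022+999999\n";
        for (String row : readLines(new Scanner(sample))) {
            System.out.println(row);
        }
    }
}
```

Each returned string could then be passed to the TemperatureLog(String line) constructor to build the list.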
Please refer to the following descriptions on each step.
2) Decomposing : Next, decompose each line into its fields according to the following structure and store them in the ArrayList. (4pts)
- Encoding rule for each line:
1st int : ID
2nd String :
0000 : meaningless 4-digit
____ : Year (4-digit)
A001 : Sensor code (until the end of the String)
3rd String :
'+' or '-' : positive or negative
digits (prior to the next '+' symbol) : temperature (Fahrenheit degree)
following +999999 : meaningless
- Analyzed data within ArrayList of the TemperatureLog objects
You will generate another ArrayList which contains objects of class TemperatureLog as follows:
public class TemperatureLog {
private String aRow;
private int ID;
private int year;
private double tempDbl; // temperature value typed
private int tempInt; // either double or int
private String sensorCode;
...
}
In the above driver class, TestTemperatureMain, you are supposed to create the following ArrayList of TemperatureLog objects.
...
ArrayList<TemperatureLog> tempList = new ArrayList<>();
...
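The encoding rule for step 2) can be sketched as follows. This is a minimal illustration, not the required solution: the names DecomposeSketch, Parsed, and parse are hypothetical, the fields are assumed to be separated by whitespace, and the temperature is assumed to be a whole number that ends at the next '+' symbol.

```java
// Hypothetical helper illustrating the encoding rule; names are assumptions.
class DecomposeSketch {
    // Simple holder for the decomposed fields of one row.
    static class Parsed {
        final int id;
        final int year;
        final String sensorCode;
        final int temp;

        Parsed(int id, int year, String sensorCode, int temp) {
            this.id = id;
            this.year = year;
            this.sensorCode = sensorCode;
            this.temp = temp;
        }
    }

    static Parsed parse(String row) {
        String[] parts = row.trim().split("\\s+");
        int id = Integer.parseInt(parts[0]);

        // 2nd field: 4 meaningless digits, 4-digit year, then the sensor code.
        int year = Integer.parseInt(parts[1].substring(4, 8));
        String sensorCode = parts[1].substring(8);

        // 3rd field: sign, digits up to the next '+', then the meaningless tail.
        String third = parts[2];
        int end = third.indexOf('+', 1);
        int temp = Integer.parseInt(third.substring(1, end));
        if (third.charAt(0) == '-') {
            temp = -temp;
        }
        return new Parsed(id, year, sensorCode, temp);
    }

    public static void main(String[] args) {
        Parsed p = parse("4 00001950A039 +0103+999999");
        // Prints: 4 1950 A039 103
        System.out.println(p.id + " " + p.year + " " + p.sensorCode + " " + p.temp);
    }
}
```

In the assignment, the same values would be stored through the TemperatureLog setter methods instead of a separate holder class.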
[Option I] Original Input Text File:
"data.txt"
1 00001950A01 +0011+999999
2 00001950A012 +0022+999999
3 00001950A02 +0065+999999
4 00001950A039 +0103+999999
5 00001950B001 +0099+999999
6 00001950B026 +0054+999999
7 00001950C006 +0109+999999
8 00001950D01 +0085+999999
9 00001950D30 +0072+999999
10 00001950E03 +0120+999999
11 00001951A01 +0026+999999
12 00001951A012 +0035+999999
13 00001951A02 +0059+999999
14 00001951A039 +0110+999999
15 00001951B001 +0103+999999
16 00001951B026 +0049+999999
17 00001951C006 +0099+999999
18 00001951D01 +0091+999999
19 00001951D30 +0085+999999
20 00001951E03 +0117+999999
21 00001953A01 +0026+999999
22 00001953A012 +0041+999999
23 00001953A02 +0069+999999
24 00001953A039 +0110+999999
25 00001953B001 +0100+999999
26 00001953B026 +0072+999999
27 00001953C006 +0087+999999
28 00001953D01 +0102+999999
29 00001953D30 +0095+999999
30 00001953E03 +0102+999999
31 00001954A01 +0033+999999
32 00001954A012 +0046+999999
33 00001954A02 +0057+999999
34 00001954A039 +0106+999999
35 00001954B001 +0119+999999
36 00001954B026 +0093+999999
37 00001954C006 +0057+999999
38 00001954D01 +0089+999999
39 00001954D30 +0088+999999
40 00001954E03 +0092+999999
[Option II] Simplified Input File:
"data2.txt"
1950 11
1950 22
1950 65
1950 103
1950 99
1950 54
1950 109
1950 85
1950 72
1950 120
1951 26
1951 35
1951 59
1951 110
1951 103
1951 49
1951 99
1951 91
1951 85
1951 117
1953 26
1953 41
1953 69
1953 110
1953 100
1953 72
1953 87
1953 102
1953 95
1953 102
1954 33
1954 46
1954 57
1954 106
1954 119
1954 93
1954 57
1954 89
1954 88
1954 92
3) Mapping : You are supposed to map the data as follows: (4pts)
(1950, 11)
(1950, 22)
(1950, 65)
(1950, 103)
(1950, 99)
(1950, 54)
(1950, 109)
(1950, 85)
(1950, 72)
(1950, 120)
(1951, 26)
(1951, 35)
(1951, 59)
(1951, 110)
(1951, 103)
(1951, 49)
(1951, 99)
(1951, 91)
(1951, 85)
(1951, 117)
(1953, 26)
(1953, 41)
(1953, 69)
(1953, 110)
(1953, 100)
(1953, 72)
(1953, 87)
(1953, 102)
(1953, 95)
(1953, 102)
(1954, 33)
(1954, 46)
(1954, 57)
(1954, 106)
(1954, 119)
(1954, 93)
(1954, 57)
(1954, 89)
(1954, 88)
(1954, 92)
The above list is to be produced from the intermediate (internal) ArrayList of YearTemp objects.
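One possible shape for the mapping step is sketched below. The question names a YearTemp class but does not define it, so its fields here are assumptions, as are the names MapSketch and Log (a stand-in for an already decomposed TemperatureLog).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the mapping step; all class layouts are assumptions.
class MapSketch {
    // Stand-in for a decomposed TemperatureLog record.
    static class Log {
        final int id;
        final int year;
        final String sensorCode;
        final int temp;

        Log(int id, int year, String sensorCode, int temp) {
            this.id = id;
            this.year = year;
            this.sensorCode = sensorCode;
            this.temp = temp;
        }
    }

    // A (year, temperature) pair.
    static class YearTemp {
        final int year;
        final int temp;

        YearTemp(int year, int temp) {
            this.year = year;
            this.temp = temp;
        }

        @Override
        public String toString() {
            return "(" + year + ", " + temp + ")";
        }
    }

    // Keeps only the year and temperature of each record, preserving order.
    static List<YearTemp> map(List<Log> logs) {
        List<YearTemp> mapped = new ArrayList<>();
        for (Log log : logs) {
            mapped.add(new YearTemp(log.year, log.temp));
        }
        return mapped;
    }

    public static void main(String[] args) {
        List<Log> logs = Arrays.asList(
                new Log(1, 1950, "A01", 11),
                new Log(2, 1950, "A012", 22));
        for (YearTemp pair : map(logs)) {
            System.out.println(pair); // prints (1950, 11) then (1950, 22)
        }
    }
}
```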
4) Reorganizing (or Shuffling) : The above list is to be reorganized as follows: (4pts)
You need to create a method which reads the list from question 3) and reorganizes it into the form shown below. A new class, AllInOne, might be defined to handle this function.
Reorganizing:
(1950, [11, 22, 65, 103, 99, 54, 109, 85, 72, 120])
(1951, [26, 35, 59, 110, 103, 49, 99, 91, 85, 117])
(1953, [26, 41, 69, 110, 100, 72, 87, 102, 95, 102])
(1954, [33, 46, 57, 106, 119, 93, 57, 89, 88, 92])
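The grouping shown above can be sketched with a map from year to a list of temperatures. Note that this swaps in java.util.LinkedHashMap in place of the AllInOne class the question suggests; the names ReorganizeSketch and reorganize are hypothetical, and the pairs are passed as plain int arrays to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the reorganizing (shuffling) step.
class ReorganizeSketch {
    // Each input element is {year, temperature}. Years keep first-seen order.
    static Map<Integer, List<Integer>> reorganize(int[][] pairs) {
        Map<Integer, List<Integer>> grouped = new LinkedHashMap<>();
        for (int[] pair : pairs) {
            // Create the list for a year on first sight, then append the value.
            grouped.computeIfAbsent(pair[0], year -> new ArrayList<>()).add(pair[1]);
        }
        return grouped;
    }

    public static void main(String[] args) {
        int[][] pairs = { {1950, 11}, {1950, 22}, {1951, 26}, {1950, 65}, {1951, 35} };
        for (Map.Entry<Integer, List<Integer>> e : reorganize(pairs).entrySet()) {
            // Prints (1950, [11, 22, 65]) then (1951, [26, 35])
            System.out.println("(" + e.getKey() + ", " + e.getValue() + ")");
        }
    }
}
```

A LinkedHashMap is used here so the years print in the order they first appear; a TreeMap would sort them by year instead.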
5) Reducing : The above list is to be reduced as follows: (4pts)
Finally, the above 4 lines of intermediate data should be reduced by picking the highest temperature among the values contained for each year.
Reducing:
(1950, 120)
(1951, 117)
(1953, 110)
(1954, 119)
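The reduction above can be sketched as a maximum over each year's list. The names ReduceSketch and reduce are hypothetical, and the input is assumed to be a map from year to a non-empty list of temperatures, which is one possible representation of the reorganized data.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the reducing step.
class ReduceSketch {
    // Maps each year to its highest temperature.
    // Collections.max throws NoSuchElementException if a list is empty.
    static Map<Integer, Integer> reduce(Map<Integer, List<Integer>> grouped) {
        Map<Integer, Integer> highest = new LinkedHashMap<>();
        for (Map.Entry<Integer, List<Integer>> e : grouped.entrySet()) {
            highest.put(e.getKey(), Collections.max(e.getValue()));
        }
        return highest;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> grouped = new LinkedHashMap<>();
        grouped.put(1950, Arrays.asList(11, 22, 65, 103, 99, 54, 109, 85, 72, 120));
        grouped.put(1954, Arrays.asList(33, 46, 57, 106, 119, 93, 57, 89, 88, 92));
        for (Map.Entry<Integer, Integer> e : reduce(grouped).entrySet()) {
            // Prints (1950, 120) then (1954, 119)
            System.out.println("(" + e.getKey() + ", " + e.getValue() + ")");
        }
    }
}
```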
id   year   sensor code   temperature(F)
1    1950   A01           11
2    1950   A012          22
3    1950   A02           65
4    1950   A039          103
...  ...    ...           ...
Explanation / Answer
import java.util.*;
import java.io.*;