Question
This exercise is very representative of the sort of data processing conducted in organizations. It consists of writing a small application which extracts some basic statistics from a commonly used data source. We could, of course, use tools such as Excel or a statistics package for this job, but this type of programming (extracting information from raw data sets) is so common that we want to practice some of it (not to mention that it furthers our programming skills).
In this exercise we work with arrays, but we also build our own data structure (a class), and we implement our program as a Windows Forms application.
Consider the 2016 Jan. 15, Jan. 16 and Jan. 17 log files from the TeachEngineering.org Apache Web server. The logs represent HTTP requests made to the server on those days. Requests are represented with the following format:
server-id requester-ip - - [date time of the request] "the request" HTTP-return-code number-of-bytes-served some-more-stuff
Your Assignment
Write a Windows Forms-based program which allows a user to select a log file and which then finds the 100 top-hitting IPs in that log file; i.e., the 100 IPs which issued the most requests, not counting IP '140.211.167.204'.
The program lists these top-100 IPs in descending order of number of requests made (largest hitter on top), followed by their total number of requests; one IP per line (see examples below).
At the end (bottom) of the list, the program prints the following (see examples below):
Number of hits by IP 140.211.167.204.
IP with the largest number of hits and its number of hits.
IP with the smallest number of hits and its number of hits (note, this is the smallest of the top-100).
Average number of hits per IP (include only the 100 IPs mentioned before).
Your program must have a simple Windows Forms-based interface (see examples below).
The program must use the built-in OpenFileDialog file browser class for selecting a log file.
After displaying the results for a log file, the program must allow a user to pick up the same or another log file and compute the results for that file without having to (re)start the application.
Make sure that as you pick up additional log files, the results from the previous log files are cleared out from the interface.
Define and use a class hitter which holds both an IP and a hit count.
Store each hitter as an object in an array hitters containing x hitter objects.
Make sure that the computed stats are correct (the example below is for the Jan. 15 data).
As always, make your program robust and perform error checking and exception handling, e.g.,
The Go button is pushed but no file has been selected.
The Go button is pushed but the text in the Select log file TextBox is not a file (could have been typed in).
The selected file is not a log file. Note: do not(!!) rely on the file extension (.txt).
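The counting and summary statistics asked for above can be sketched as follows. This is a minimal Java illustration under stated assumptions, not the required Windows Forms solution: it uses a HashMap in place of the hitters array the assignment asks for, the class and method names (HitCounter, countHits, topHitters, average) are hypothetical, and the sample lines in main are invented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class HitCounter {
    static final String EXCLUDED_IP = "140.211.167.204";

    // Counts requests per IP. Assumes the IP is the second space-separated field.
    static Map<String, Integer> countHits(String[] lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            String[] parts = line.split(" ");
            if (parts.length < 2) {
                continue; // no IP field on this line
            }
            counts.merge(parts[1], 1, Integer::sum);
        }
        return counts;
    }

    // Returns at most topN entries in descending order of hits, without the excluded IP.
    static List<Map.Entry<String, Integer>> topHitters(Map<String, Integer> counts, int topN) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (!e.getKey().equals(EXCLUDED_IP)) {
                entries.add(e);
            }
        }
        entries.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        return new ArrayList<>(entries.subList(0, Math.min(topN, entries.size())));
    }

    // Average number of hits over the listed entries (0.0 for an empty list).
    static double average(List<Map.Entry<String, Integer>> top) {
        long sum = 0;
        for (Map.Entry<String, Integer> e : top) {
            sum += e.getValue();
        }
        return top.isEmpty() ? 0.0 : (double) sum / top.size();
    }

    public static void main(String[] args) {
        // Invented sample lines, only to exercise the methods.
        String[] lines = {
            "srv 192.0.2.1 - -", "srv 192.0.2.1 - -",
            "srv 192.0.2.2 - -", "srv 140.211.167.204 - -"
        };
        Map<String, Integer> counts = countHits(lines);
        List<Map.Entry<String, Integer>> top = topHitters(counts, 100);
        for (Map.Entry<String, Integer> e : top) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
        System.out.println("Hits by " + EXCLUDED_IP + ": " + counts.getOrDefault(EXCLUDED_IP, 0));
        System.out.println("Average: " + average(top));
    }
}
```

Because the list is sorted in descending order, its first entry is the largest hitter and its last entry is the smallest of the top 100.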
A few hints
It is a good idea to access storage/file space as little as possible and to do all computations in memory. Hence, we recommend that you read the entire log file into memory before processing its data. A nice and handy way of doing this is to read the file in so that each line in the log file becomes a string in an array of strings. Here's the code:
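A minimal Java sketch of this step follows. The assignment itself targets Windows Forms, so treat it as an illustration of the idea rather than the course's own snippet. The class name LogLoader and the helper readLines are hypothetical, and the choice of character set is an assumption.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

class LogLoader {
    // Hypothetical helper: reads the file once and returns one string per line.
    static String[] readLines(String fileName) throws IOException {
        Path path = Paths.get(fileName);
        // Assumption: ISO_8859_1 maps every byte to a character, so unusual
        // bytes inside a request string cannot cause a decoding exception.
        List<String> all = Files.readAllLines(path, StandardCharsets.ISO_8859_1);
        return all.toArray(new String[0]);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java LogLoader <log file>");
            return;
        }
        String[] lines = readLines(args[0]);
        System.out.println("Read " + lines.length + " lines");
    }
}
```

A missing or unreadable file surfaces as an IOException, which is a natural place to hook in the error handling the assignment asks for.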
One of the requirements of this assignment is that you work with a class hitter which contains both the IP address and the number of requests made by that IP:
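One possible shape for such a class, sketched in Java. The class is spelled Hitter here to follow Java naming conventions, and the field names ip and hits are assumptions; the assignment only requires that the class hold an IP and a hit count.

```java
class Hitter {
    // Field names are assumptions chosen for this sketch.
    final String ip;
    int hits;

    Hitter(String ip, int hits) {
        this.ip = ip;
        this.hits = hits;
    }

    // One IP per line, followed by its number of requests.
    @Override
    public String toString() {
        return ip + " " + hits;
    }

    public static void main(String[] args) {
        Hitter h = new Hitter("192.0.2.1", 0);
        h.hits++;                 // one more request seen from this IP
        System.out.println(h);    // prints "192.0.2.1 1"
    }
}
```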
Once you have the hitter class, it makes sense to declare an array of hitters so that you can store a bunch of hitter objects. Let's say that you have n_hitters hitters:
Once you have the hitters array, you can allocate memory for individual hitter objects and store them in the array. Let's do the i-th hitter object:
SOMETHING TO NOTE!!!! Allocating the array of hitters DOES NOT ALLOCATE space for the objects themselves; it only allocates space for references to hitter objects. Hence, the line hitters[0] = new hitter(); does two things: it allocates memory for a hitter object and it stores a reference to that hitter object in the hitters array at location 0.
Another way to put this: if you want an array of objects, you must allocate memory space for the array first and then allocate memory space for each of the objects in the array.
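The two-step allocation described above behaves the same way in Java, which makes for a compact illustration. This is a sketch under assumptions: the nested Hitter class is a minimal stand-in with assumed field names, and allocate is a hypothetical helper.

```java
class HitterArrayDemo {
    // Minimal stand-in for the hitter class (field names assumed).
    static class Hitter {
        String ip;
        int hits;

        Hitter(String ip, int hits) {
            this.ip = ip;
            this.hits = hits;
        }
    }

    // Step 1 allocates the array of references; step 2 allocates one object per slot.
    static Hitter[] allocate(int nHitters) {
        Hitter[] hitters = new Hitter[nHitters];   // references only, all null so far
        for (int i = 0; i < hitters.length; i++) {
            hitters[i] = new Hitter("", 0);        // slot i now refers to a real object
        }
        return hitters;
    }

    public static void main(String[] args) {
        Hitter[] empty = new Hitter[3];
        System.out.println(empty[0] == null);      // prints "true": no object allocated yet

        Hitter[] hitters = allocate(3);
        System.out.println(hitters[0].hits);       // prints "0"
    }
}
```

Accessing a field through a slot that has not been given an object yet (for example empty[0].hits) fails at run time with a null reference error, which is the mistake the note above warns about.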
You must obviously find the IP of each requester in each line of the log file. There are several ways of doing this, but a nice and easy way is to split the line in the log file over its spaces into an array of strings; i.e., the single string representing the line in the log file (which is now an element in the lines array) will itself be made into an array of strings of which the second element always holds the IP address. In code:
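A minimal Java sketch of the split. The helper name extractIp is hypothetical, and the sample line in main is invented to match the format described earlier; it is not taken from the actual logs.

```java
class IpExtractor {
    // Returns the second space-separated field of a log line,
    // or null when the line has fewer than two fields.
    static String extractIp(String line) {
        String[] parts = line.split(" ");
        return parts.length > 1 ? parts[1] : null;
    }

    public static void main(String[] args) {
        // Invented sample line, shaped like the format described in the assignment.
        String line = "srv1 192.0.2.10 - - [15/Jan/2016:00:00:01 -0800] "
                + "\"GET /index.html HTTP/1.1\" 200 1234";
        System.out.println(extractIp(line)); // prints "192.0.2.10"
    }
}
```

Returning null for short lines, rather than indexing blindly, keeps a blank or malformed line from raising an out-of-range error.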
Since you must put out a sorted list of IPs, at one point or another you must sort the hitter array in descending order of hit count. You can write this sorting code yourself or use the following one-line mechanism. (Note: this assumes a hitter array called hitters which is full; i.e., all elements in the array are hitter objects):
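In Java, one way to get a comparable one-line sort is Arrays.sort with a Comparator. This is a sketch, not the mechanism the course supplied: the nested Hitter class and the helper sortByHitsDescending are assumptions, and, as the note above says, the array must be full, since a null element would make the comparator throw.

```java
import java.util.Arrays;
import java.util.Comparator;

class SortDemo {
    // Minimal stand-in for the hitter class (field names assumed).
    static class Hitter {
        final String ip;
        final int hits;

        Hitter(String ip, int hits) {
            this.ip = ip;
            this.hits = hits;
        }
    }

    // Sorts in place, largest hit count first. Every element must be non-null.
    static void sortByHitsDescending(Hitter[] hitters) {
        Arrays.sort(hitters, Comparator.comparingInt((Hitter h) -> h.hits).reversed());
    }

    public static void main(String[] args) {
        Hitter[] hitters = {
            new Hitter("192.0.2.1", 5),
            new Hitter("192.0.2.2", 42),
            new Hitter("192.0.2.3", 17)
        };
        sortByHitsDescending(hitters);
        for (Hitter h : hitters) {
            System.out.println(h.ip + " " + h.hits);
        }
    }
}
```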
Please note!!! IPs which have the same number of hits do not themselves have to be sorted!!!
If you point the program to a file which is not a log file, your program should detect that. An imperfect but good-enough way to do that is to check to see if what it picks up as an IP address is actually an IP address. Here's the code for that:
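A hedged Java sketch of such a check. It assumes dotted-decimal IPv4 addresses only (an IPv6 requester would be rejected), and the helper name looksLikeIpv4 is hypothetical. The string is parsed by hand so that no DNS lookup is triggered for text that is not an address.

```java
class IpCheck {
    // Returns true when s has the form a.b.c.d with each part a number in 0..255.
    static boolean looksLikeIpv4(String s) {
        if (s == null) {
            return false;
        }
        // The limit -1 keeps trailing empty parts, so "1.2.3.4." is rejected.
        String[] parts = s.split("\\.", -1);
        if (parts.length != 4) {
            return false;
        }
        for (String part : parts) {
            if (part.isEmpty() || part.length() > 3) {
                return false;
            }
            for (int i = 0; i < part.length(); i++) {
                char c = part.charAt(i);
                if (c < '0' || c > '9') {
                    return false;
                }
            }
            if (Integer.parseInt(part) > 255) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeIpv4("140.211.167.204")); // prints "true"
        System.out.println(looksLikeIpv4("Hello,"));          // prints "false"
    }
}
```

Checking the first few lines of a file this way is usually enough to tell a log file from an arbitrary text file.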
Explanation / Answer
//FileURLDemo.java
/**
 * This program demonstrates the URL class: it opens a local file through a
 * file: URL and prints its contents to standard output.
 */
import java.io.File;
import java.io.InputStream;
import java.net.URL;

public class FileURLDemo
{
    public static void main(String[] args)
    {
        try
        {
            File file = new File("URLDemo.java");
            // toURI().toURL() builds a well-formed file: URL on every platform
            // and escapes characters such as spaces in the path.
            URL fileURL = file.toURI().toURL();
            // try-with-resources closes the stream even if reading fails.
            try (InputStream in = fileURL.openStream())
            {
                int data;
                while ((data = in.read()) != -1)
                {
                    System.out.print((char) data);
                }
            }
        }
        catch (Exception e)
        {
            System.out.println("Exception: " + e);
        }
    }
}

//InetAddressDemo.java
/**
 * This program demonstrates the InetAddress class.
 */
import java.net.InetAddress;

public class InetAddressDemo
{
    public static void main(String[] args)
    {
        try
        {
            // Look up the machine this program runs on.
            InetAddress add = InetAddress.getLocalHost();
            System.out.println(" Local Host Details : " + add);
            System.out.println("The Host IP Address is : " + add.getHostAddress());
            System.out.println("The Host name is : " + add.getHostName());
            // Look up a host by name. If "Reflection16" cannot be resolved,
            // getByName throws UnknownHostException, reported by the catch below.
            add = InetAddress.getByName("Reflection16");
            System.out.println(" Local Host: " + add);
        }
        catch (Exception e)
        {
            System.out.println("Exception: " + e);
        }
    }
}