You are given three data files (data1.txt, data2.txt, and data3.txt) consisting

Question

You are given three data files (data1.txt, data2.txt, and data3.txt), each consisting of integers (one integer per line). Write Python code that reads each of these three files, computes the sample mean and standard deviation of each data set, and writes them to a file, output.txt. For this problem, you should import simple_ds.py and use its mean function rather than including the code directly in your script. Next, combine the data from the three files, print the combined data on the screen, and draw a histogram of it. The screen output must be in a reader-friendly format. Use matplotlib (as demonstrated in hist4.py), not MATLAB, to plot the histogram of the combined data, and save the plot in .png format. Submission for this problem: Python code + screen capture of cmd + histogram (.png).

Data 1

1
2
3
4
5

8
9
9
10
0
12
1
1
1
2
0
2
3
3

Data 2

4
5
1
2
7
8
8
9
0
1
2
3
4
5

Data 3

1
2
3
1
1
6

10
23
11
11
24
55

hist4.py

# -------------------------------------
# hist4.py - reads a file consisting of
# integers (1 integer on each line) and
# counts the number of occurrences of
# each value (a histogram) and generates a
# histogram plot using matplotlib.
#
# 2015-09-02 - jeff smith
#
# $Id: $
# -------------------------------------

import matplotlib.pyplot as plt

# read the values
vals = [int(i.rstrip()) for i in open('data.txt','r') if i.rstrip()]

# dictionary to hold the counts
hist = {}

# loop through each unique value
for i in range(min(vals),max(vals)+1):
    # key = integer, value = count
    hist[i] = vals.count(i)

# get and sort the keys
skeys = sorted(hist.keys())

# display
for key in skeys:
    print("{:3d} : {}".format(key, hist[key]))

# Graphical version
plt.figure(1, figsize=(5,3))
plt.yticks(fontsize=8)
plt.xticks(fontsize=8)
plt.hist(vals, bins=15)
plt.title('Observed Counts',fontsize=10)
plt.show()

Explanation / Answer

# code to be copied

import math
import matplotlib.pyplot as plt
from simple_ds import mean  # assuming mean(values) returns the mean of a list of values

# input files and the output file
data_files = ["data1.txt", "data2.txt", "data3.txt"]
output_file = "output.txt"

def read_values(filename):
    # read one integer per line, skipping blank lines
    with open(filename, "r") as f:
        return [int(line) for line in f if line.strip()]

def sample_std(values):
    # sample standard deviation (n - 1 in the denominator), built on mean()
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / float(len(values) - 1))

# to store all the integers from the three files
combined = []

# write the mean and standard deviation of each data set to output.txt
with open(output_file, "w") as out:
    for name in data_files:
        vals = read_values(name)
        out.write("{}: mean = {:.3f}, std dev = {:.3f}\n".format(
            name, mean(vals), sample_std(vals)))
        combined.extend(vals)

# hist to store the count of each value in the combined data
hist = {}
for i in range(min(combined), max(combined) + 1):
    hist[i] = combined.count(i)

# print the combined data and its counts in a reader-friendly format
print("Combined data ({} values):".format(len(combined)))
print(", ".join(str(v) for v in combined))
for key in sorted(hist):
    print("{:3d} : {}".format(key, hist[key]))

# Graphical Output: histogram of the combined data, saved as a .png
plt.figure(1, figsize=(5, 3))
plt.yticks(fontsize=8)
plt.xticks(fontsize=8)
plt.hist(combined, bins=15)
plt.title('Histogram', fontsize=10)
plt.savefig("histogram.png")
plt.show()
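
The question does not show simple_ds.py, so the answer cannot be run as posted without it. Below is a minimal stand-in, assuming the module only needs to expose a mean(values) function; the real course file may differ. The sample_std helper is hypothetical and is included only to illustrate the sample (n - 1) standard deviation that the question asks for.

```python
# simple_ds.py - hypothetical stand-in for the course module, which is not shown
import math


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / float(len(values))


def sample_std(values):
    """Return the sample standard deviation (n - 1 in the denominator).

    Hypothetical helper: the real simple_ds.py may not provide it.
    Needs at least two values, otherwise the denominator is zero.
    """
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / float(len(values) - 1))


if __name__ == "__main__":
    # quick check on a small list
    sample = [1, 2, 3, 4, 5]
    print("mean = {:.3f}".format(mean(sample)))        # mean = 3.000
    print("std  = {:.3f}".format(sample_std(sample)))  # std  = 1.581
```

Saving this as simple_ds.py next to the script lets the import of mean resolve. If the course version of simple_ds.py is available, it should be used instead.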
