


Question

You are given three data files (data1.txt, data2.txt, and data3.txt), each consisting of integers (one integer per line). Write Python code that reads each of these three files, computes the sample mean and standard deviation for each data set, and writes them to a file, output.txt. For this problem, you should import simple_ds.py and use its mean function rather than including the code directly in your script. Next, combine the data from the three files, draw a histogram of the combined data, and print the combined data on the screen. The output to the screen must be in a reader-friendly format. Use matplotlib (as demonstrated in hist4.py), not MATLAB, to plot the histogram.

Data set 1:

1
2
3
4
5

8
9
9
10
0
12
1
1
1
2
0
2
3
3

Data set 2:

4
5
1
2
7
8
8
9
0
1
2
3
4
5

Data set 3:

1
2
3
1
1
6

10
23
11
11
24
55

simple_ds.py:

# mean() - compute the sample mean.
# Parameters:
#   N  a list of numbers
#
def mean( N ):
    # running total
    Total = 0
    # count of the number of items
    Count = len(N)
    # for each item in the list
    for Num in N:
        # increment the total
        Total = Total + Num
    # compute the sample average (0 for an empty list)
    average = float(Total)/Count if Count > 0 else 0
    return(average)

#
# std_dev() - compute the sample standard deviation.
# Parameters:
#   N  a list of numbers
#
def std_dev( N ):
    Count = len(N)
    # Compute the average
    average = mean(N)
    if Count > 1:
        # Compute the std dev.
        Total = 0
        for Num in N:
            Total = Total + (float(Num) - average)**2
        std_dev = ((float(1)/(Count-1))*Total)**(float(1)/2)
    else:
        std_dev = 0
    return(std_dev)

# default main() for command-line execution.
def main():
    Numbers = [123, 87, 96, 24, 104, 16, 85, 55, 62, 109]
    # display output (print() with one argument works in Python 2 and 3)
    print('Numbers: {}'.format(Numbers))
    the_avg = mean(Numbers)
    print('Average: {:.3f}'.format(the_avg))
    the_std_dev = std_dev(Numbers)
    print('Standard deviation: {:.3f}'.format(the_std_dev))

# This line causes main() to be executed if this module
# is executed without an import (i.e., from the command line).
# If the module is imported, the condition will fail.
if __name__ == "__main__":
    main()
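
As a quick sanity check on the formulas in simple_ds.py, the sketch below recomputes the sample mean and the sample standard deviation (dividing by n - 1) for the list from main() and compares the results with Python's standard statistics module. The name sample_std_dev is hypothetical; it is a compact restatement of the formula, not the module's own code.

```python
import math
import statistics

# The sample list used in simple_ds.py's main().
numbers = [123, 87, 96, 24, 104, 16, 85, 55, 62, 109]

def sample_std_dev(values):
    """Hypothetical helper: sample standard deviation with an n - 1 divisor."""
    n = len(values)
    if n < 2:
        return 0.0
    avg = sum(values) / float(n)
    squared_diffs = sum((v - avg) ** 2 for v in values)
    return math.sqrt(squared_diffs / (n - 1))

avg = sum(numbers) / float(len(numbers))
print('Average: {:.3f}'.format(avg))
print('Standard deviation: {:.3f}'.format(sample_std_dev(numbers)))

# statistics.mean and statistics.stdev use the same definitions,
# so the results should agree up to floating-point rounding.
assert math.isclose(avg, statistics.mean(numbers))
assert math.isclose(sample_std_dev(numbers), statistics.stdev(numbers))
```

If the two assertions pass, the n - 1 divisor in std_dev() matches the usual definition of the sample (rather than population) standard deviation.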

Explanation / Answer

Part One

from simple_ds import mean, std_dev

# File names as given in the question.
file_names_list = ['data1.txt', 'data2.txt', 'data3.txt']

# Read each file into its own list of integers, skipping blank lines.
individual_all_data = []
for file_name in file_names_list:
    with open(file_name, "r") as f:
        all_data = [int(line) for line in f if line.strip()]
    individual_all_data.append(all_data)

# Write one "mean std_dev" line per data set. Mode "w" starts a fresh
# file on every run instead of appending to results from earlier runs.
with open("output.txt", "w") as o:
    for data in individual_all_data:
        mean_value = mean(data)
        std_dev_value = std_dev(data)
        o.write("{} {}\n".format(mean_value, std_dev_value))

Part Two

import matplotlib.pyplot as plt

# File names as given in the question.
file_names_list = ['data1.txt', 'data2.txt', 'data3.txt']

# Combine the integers from all three files, skipping blank lines.
theta = []
for file_name in file_names_list:
    with open(file_name, "r") as f:
        theta.extend(int(line) for line in f if line.strip())

# One bin per integer value, so the largest value is not cut off.
bin_edges = list(range(min(theta), max(theta) + 2))
# the histogram of the data (density=True replaces the removed normed=True)
n, bins, patches = plt.hist(theta, bin_edges, density=True, histtype='bar', facecolor='green')
plt.xlabel('Value')
plt.ylabel('Density')
plt.show()
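
The question also asks for the combined data to be printed on the screen in a reader-friendly format, which the code above does not do. One possible approach, sketched here, is to print the values in fixed-width columns, a few per row. The helper name format_rows, the default of 10 values per row, and the short stand-in list are assumptions for illustration; in the script the list would be theta.

```python
def format_rows(values, per_row=10, width=4):
    """Hypothetical helper: lay out values in right-aligned columns."""
    lines = []
    for start in range(0, len(values), per_row):
        row = values[start:start + per_row]
        lines.append(''.join('{:>{w}}'.format(v, w=width) for v in row))
    return '\n'.join(lines)

# Stand-in for the combined list (theta in Part Two).
combined = [1, 2, 3, 4, 5, 8, 9, 9, 10, 0, 12, 1]

print('Combined data ({} values):'.format(len(combined)))
print(format_rows(combined))
```

Returning a string instead of printing inside the helper keeps the layout easy to check and lets the same text be written to a file if needed.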
