
PYTHON Write a Python program that will read in a two-column comma-separated val


Question

PYTHON: Write a Python program that will read in a two-column comma-separated value (CSV) file and do a linear regression. You should do the regression calculation yourself. You should check your results with either np.polyfit or np.linalg.lstsq, but you should either (a) calculate the normal matrix and solve the system yourself, or (b) calculate the sample averages, standard deviations, and correlations yourself, and not use built-in functions for these operations. On a single figure, make a scatter plot of the original data points and a line plot of the linear regression of y on x. Have the program automatically write the solution equation y=mx+b somewhere on the plot.

Here are my values from the text file, since I can't attach a file:

1,0.9984489205
2,0.9752788428
3,0.8769402712
4,0.6284264418
5,0.1772490897
6,-0.4210076698
7,-0.9162875775
8,-0.9116652003
9,-0.1990087588
10,0.7565584253
11,0.8973685995
12,-0.1666463731
13,-0.9999420786
14,-0.0774674983
15,0.9994582237

Explanation / Answer

Program Code:

import csv
import math

#SampleAverages: return the arithmetic mean of a list of numbers
def SampleAverages(values):
    total = 0
    for v in values:
        total = total + v
    return total / len(values)

#StandardDeviation: return the sample standard deviation (n - 1 in the
#denominator) of points, given their mean m
def StandardDeviation(m, points):
    s = 0
    for p in points:
        s = s + (p - m) ** 2
    return math.sqrt(s / (len(points) - 1))

#slopeValue: return the least-squares slope of y on x,
#(sum(x*y) - n*xm*ym) / (sum(x^2) - n*xm^2)
def slopeValue(xm, ym, xV, yV):
    n = len(xV)
    xysum = 0
    xsq = 0
    for i in range(n):
        xysum = xysum + xV[i] * yV[i]
        xsq = xsq + xV[i] ** 2
    return (xysum - n * xm * ym) / (xsq - n * xm ** 2)

   

#interceptValue: return the y-intercept b = ymean - slope*xmean
def interceptValue(ymean, slope, xmean):
    return ymean - slope * xmean

#correlationValue: return the sample correlation coefficient r, the sum of
#products of the standardized x and y values divided by n - 1
def correlationValue(xmean, ymean, xdeviation, ydeviation, xValues, yValues):
    xyco = 0
    for i in range(len(xValues)):
        xco = (xValues[i] - xmean) / xdeviation
        yco = (yValues[i] - ymean) / ydeviation
        xyco = xyco + xco * yco
    return xyco / (len(xValues) - 1)

   

#read the x and y columns from the CSV file Input.csv
xValues=[]
yValues=[]
with open('Input.csv', newline='') as f:
    for row in csv.reader(f):
        #skip blank lines, such as a trailing newline at the end of the file
        if not row:
            continue
        xValues.append(int(row[0]))
        yValues.append(float(row[1]))

#print the x-y coordinates

for i in range(len(xValues)):

    print(xValues[i], " ",yValues[i] )

#find the mean of x coordinates and y coordinates by calling the
#function SampleAverages and print the values

xmean=SampleAverages(xValues)

ymean=SampleAverages(yValues)

print("The mean of X= ", round(xmean,2))

print("The mean of Y= ", round(ymean,2))

#find the standard deviation of x coordinates and y coordinates by calling the
#function StandardDeviation and print the values

xdeviation=StandardDeviation(xmean, xValues)

print("The standard deviation of X= ", round(xdeviation,3))

ydeviation=StandardDeviation(ymean, yValues)

print("The standard deviation of Y= ", round(ydeviation,3))

#find the slope of the line by calling the slopeValue function and print the value

slope=slopeValue(xmean, ymean, xValues, yValues)

print("The slope of the line is: ", round(slope,3))

#find the y-intercept of the line by calling the interceptValue function and print the value

intercept=interceptValue(ymean, slope, xmean)

print("The y-intercept value = ", round(intercept,3))

print(" ")

#build the equation string in the form y = mx + b
equation="y = "+str(round(slope,3))+"x + "+str(round(intercept,3))

#find the correlation of the line and print the value

correlation=correlationValue(xmean, ymean,xdeviation,ydeviation, xValues, yValues)

print("The correlation value = ", round(correlation,3))

#finally print the equation of the line

print("Therefore, the equation of the line is ", equation)

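#plot the data and the fitted line (imports sit here so this step is self-contained)
import numpy as np
import matplotlib.pyplot as plt

#check the hand calculation: a degree-1 polyfit returns [slope, intercept]
m_check, b_check = np.polyfit(xValues, yValues, 1)
assert np.allclose([m_check, b_check], [slope, intercept])

#evaluate the regression line at both ends of the x range
x_line = [min(xValues), max(xValues)]
y_line = [slope * x + intercept for x in x_line]

plt.plot(xValues, yValues, 'o', label="data")
plt.plot(x_line, y_line, '-', label="regression of y on x")
plt.xlim([min(xValues) - 1, max(xValues) + 1])
#write the equation in the top-left corner, in axes coordinates
plt.text(0.05, 0.95, equation, transform=plt.gca().transAxes, va="top")
plt.legend(loc="lower left")
plt.show()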
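Route (b) in the question ties the slope to the correlation: for a regression of y on x, m = r * s_y / s_x and b = mean(y) - m * mean(x). The sketch below is one way to write that relation out. It assumes the question's data is typed in directly rather than read from Input.csv, and the names mean and regression_from_correlation are illustrative, not part of the answer's program.

```python
import math

# Data from the question: x = 1..15 paired with the listed y values.
xs = list(range(1, 16))
ys = [0.9984489205, 0.9752788428, 0.8769402712, 0.6284264418, 0.1772490897,
      -0.4210076698, -0.9162875775, -0.9116652003, -0.1990087588, 0.7565584253,
      0.8973685995, -0.1666463731, -0.9999420786, -0.0774674983, 0.9994582237]


def mean(values):
    """Arithmetic mean, accumulated in an explicit loop."""
    total = 0.0
    for v in values:
        total += v
    return total / len(values)


def regression_from_correlation(xs, ys):
    """Return (slope, intercept, r) using slope = r * s_y / s_x."""
    n = len(xs)
    x_mean = mean(xs)
    y_mean = mean(ys)
    sxx = syy = sxy = 0.0
    for x, y in zip(xs, ys):
        sxx += (x - x_mean) ** 2
        syy += (y - y_mean) ** 2
        sxy += (x - x_mean) * (y - y_mean)
    s_x = math.sqrt(sxx / (n - 1))  # sample standard deviation of x
    s_y = math.sqrt(syy / (n - 1))  # sample standard deviation of y
    r = sxy / ((n - 1) * s_x * s_y)  # sample correlation coefficient
    slope = r * s_y / s_x
    intercept = y_mean - slope * x_mean
    return slope, intercept, r


slope, intercept, r = regression_from_correlation(xs, ys)
print(f"y = {slope:.3f}x + {intercept:.3f}, r = {r:.3f}")
```

Because r * s_y / s_x simplifies algebraically to the sum-based formula used in slopeValue, both routes should give the same slope and intercept for the same data.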

Sample output (input file: Input.csv):

sh-4.2# python3 main.py
1         0.9984489205
2         0.9752788428
3         0.8769402712
4         0.6284264418
5         0.1772490897
6         -0.4210076698
7         -0.9162875775
8         -0.9116652003
9         -0.1990087588
10        0.7565584253
11        0.8973685995
12        -0.1666463731
13        -0.9999420786
14        -0.0774674983
15        0.9994582237
The mean of X=  8.0
The mean of Y=  0.17
The standard deviation of X=  4.472
The standard deviation of Y=  0.755
The slope of the line is:  -0.049
The y-intercept value =  0.564
The correlation value =  -0.289
Therefore, the equation of the line is  y = -0.049x + 0.564
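Route (a) from the question, forming the normal equations and solving them by hand, could look roughly like the sketch below. For the model y = m*x + b, the normal equations are the 2x2 system [[sum(x^2), sum(x)], [sum(x), n]] [m, b] = [sum(x*y), sum(y)], solved here with Cramer's rule and then compared against np.linalg.lstsq. The function name fit_normal_equations is hypothetical, and the data is typed in directly instead of being read from Input.csv.

```python
import numpy as np

# Data from the question: x = 1..15 paired with the listed y values.
xs = list(range(1, 16))
ys = [0.9984489205, 0.9752788428, 0.8769402712, 0.6284264418, 0.1772490897,
      -0.4210076698, -0.9162875775, -0.9116652003, -0.1990087588, 0.7565584253,
      0.8973685995, -0.1666463731, -0.9999420786, -0.0774674983, 0.9994582237]


def fit_normal_equations(xs, ys):
    """Solve the 2x2 normal equations for (slope, intercept) by Cramer's rule."""
    n = len(xs)
    sx = sy = sxx = sxy = 0.0
    for x, y in zip(xs, ys):
        sx += x
        sy += y
        sxx += x * x
        sxy += x * y
    # Determinant of the normal matrix [[sxx, sx], [sx, n]].
    det = sxx * n - sx * sx
    if det == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    m = (sxy * n - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return m, b


m, b = fit_normal_equations(xs, ys)

# Check against NumPy's least-squares solver with the design matrix [x, 1].
A = np.column_stack([xs, np.ones(len(xs))])
(m_np, b_np), *_ = np.linalg.lstsq(A, ys, rcond=None)
assert np.allclose([m, b], [m_np, b_np])

print(f"y = {m:.3f}x + {b:.3f}")
```

The assert fails loudly if the hand solution and the library result disagree, which is the kind of check the question asks for.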