


Question

* Descriptive statistics:

o Find the mean, median, mode, range, and standard deviation for the data set.

o Create a scatter plot for the data set.

* Regression Analysis:

o Perform a linear regression analysis on the data set.

o Report the correlation coefficient and the equation of the regression function, and make a few predictions based on hypothetical input values.

o Write down a summary of your conclusions (How well does the regression fit the values? How correlated are the independent and dependent variables? What does the prediction tell you?).

X     Y
108   392.5
19    46.2
13    15.7
124   422.2
40    119.4
57    170.9
23    56.9
14    77.5
45    214
10    65.3
5     20.9
48    248.1
11    23.5
23    39.6
7     48.8
2     6.6
24    134.9
6     50.9
3     4.4
23    113
6     14.8
9     48.7
9     52.1
3     13.2
29    103.9
7     77.5
4     11.8
20    98.1
7     27.9

Explanation / Answer

###R code

x=c(108,19,13,124,40,57,23,14,45,10,5,48,11,23,7,2,24,6,3,23,6,9,9,3,29,7,4,20,7)

x

y=c(392.5,46.2,15.7,422.2,119.4,170.9,56.9,77.5,214,65.3,20.9,248.1,23.5,39.6,48.8,6.6,134.9,50.9,4.4,113,14.8,48.7,52.1,13.2,103.9,77.5,11.8,98.1,27.9)

y

####Descriptive Statistics#########

###For X

summary(x)

mean(x)

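tx=table(x)###frequency of each value; note that mode(x) only returns the storage type "numeric"

as.numeric(names(tx)[tx==max(tx)])###statistical mode(s) of x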

median(x)

range(x)

a=var(x)

sd=sqrt(a)###standard deviation

sd

####For Y

summary(y)

mean(y)

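ty=table(y)###frequency of each value; note that mode(y) only returns the storage type "numeric"

as.numeric(names(ty)[ty==max(ty)])###statistical mode(s) of y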

median(y)

range(y)

a=var(y)

sd=sqrt(a) ###standard deviation

sd

### a scatter plot

plot(x,y)

###a linear regression analysis#############

fit=lm(y~x)

summary(fit)

###the correlation coefficient, the equation of the regression function

cor(x,y)

#####answer###

> x=c(108,19,13,124,40,57,23,14,45,10,5,48,11,23,7,2,24,6,3,23,6,9,9,3,29,7,4,20,7)

> x

[1] 108 19 13 124 40 57 23 14 45 10 5 48 11 23 7 2 24 6 3

[20] 23 6 9 9 3 29 7 4 20 7

> y=c(392.5,46.2,15.7,422.2,119.4,170.9,56.9,77.5,214,65.3,20.9,248.1,23.5,39.6,48.8,6.6,134.9,50.9,4.4,113,14.8,48.7,52.1,13.2,103.9,77.5,11.8,98.1,27.9)

> y

[1] 392.5 46.2 15.7 422.2 119.4 170.9 56.9 77.5 214.0 65.3 20.9 248.1

[13] 23.5 39.6 48.8 6.6 134.9 50.9 4.4 113.0 14.8 48.7 52.1 13.2

[25] 103.9 77.5 11.8 98.1 27.9

> ####Descriptive Statistics#########

> ###For X

>

> summary(x)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    2.0     7.0    13.0    24.1    24.0   124.0

> mean(x)

[1] 24.10345

> tx=table(x)

> as.numeric(names(tx)[tx==max(tx)])

[1]  7 23

> median(x)

[1] 13

> range(x)

[1] 2 124

> a=var(x)

> sd=sqrt(a)

> sd

[1] 29.37728

> ####For Y

>

> summary(y)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   4.40   23.50   52.10   93.77  113.00  422.20

> mean(y)

[1] 93.76897

> ty=table(y)

> as.numeric(names(ty)[ty==max(ty)])

[1] 77.5

> median(y)

[1] 52.1

> range(y)

[1] 4.4 422.2

> a=var(y)

> sd=sqrt(a)

> sd

[1] 106.1344

> ### a scatter plot

> plot(x,y)

> ###a linear regression analysis#############

> fit=lm(y~x)

> summary(fit)

Call:

lm(formula = y ~ x)

Residuals:

    Min      1Q  Median      3Q     Max
-50.335 -18.669  -6.492  18.836  71.300

Coefficients:

            Estimate Std. Error t value Pr(>|t|)
(Intercept)  10.0193     7.1630   1.399    0.173
x             3.4746     0.1905  18.242   <2e-16 ***

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 29.61 on 27 degrees of freedom

Multiple R-squared: 0.925, Adjusted R-squared: 0.9222

F-statistic: 332.8 on 1 and 27 DF, p-value: < 2.2e-16

> cor(x,y)

[1] 0.9617441
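As a cross-check on the regression output, the slope and intercept of a simple linear regression can be recomputed from the summary statistics: slope = r * sd(y) / sd(x) and intercept = mean(y) - slope * mean(x). The sketch below is an optional verification, not part of the original answer. The names r, slope, and intercept are chosen here for illustration, and the built-in sd() function is used directly.

```r
# Data from the question
x = c(108,19,13,124,40,57,23,14,45,10,5,48,11,23,7,2,24,6,3,23,6,9,9,3,29,7,4,20,7)
y = c(392.5,46.2,15.7,422.2,119.4,170.9,56.9,77.5,214,65.3,20.9,248.1,23.5,39.6,48.8,
      6.6,134.9,50.9,4.4,113,14.8,48.7,52.1,13.2,103.9,77.5,11.8,98.1,27.9)

# Least-squares coefficients from the summary statistics
r = cor(x, y)
slope = r * sd(y) / sd(x)              # same as cov(x, y) / var(x)
intercept = mean(y) - slope * mean(x)

# Compare with the coefficients reported by lm()
fit = lm(y ~ x)
coef(fit)
c(intercept, slope)

# In simple linear regression, R-squared is the square of r
r^2
summary(fit)$r.squared
```

If the two pairs of values agree, the equation read off the summary(fit) table is consistent with the reported correlation coefficient.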

###conclusions:

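The fitted regression equation is y = 10.0193 + 3.4746x, so each one-unit increase in x is associated with an average increase of about 3.47 in y.

The R-squared value is 0.925, which means the model explains 92.5% of the variation in y, so the line fits the data well.

The correlation coefficient is r = 0.9617, which indicates a strong (but not perfect) positive linear relationship between x and y.

The slope is highly significant (p < 2e-16), while the intercept is not significantly different from zero (p = 0.173). The model is therefore adequate for predictions within the observed range of x (2 to 124).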
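The question also asks for a few predictions. A minimal sketch using predict() on the fitted model follows. The input values 15, 50, and 100 are hypothetical, picked here only because they lie inside the observed range of x, and new_x and pred are illustrative names.

```r
# Data from the question
x = c(108,19,13,124,40,57,23,14,45,10,5,48,11,23,7,2,24,6,3,23,6,9,9,3,29,7,4,20,7)
y = c(392.5,46.2,15.7,422.2,119.4,170.9,56.9,77.5,214,65.3,20.9,248.1,23.5,39.6,48.8,
      6.6,134.9,50.9,4.4,113,14.8,48.7,52.1,13.2,103.9,77.5,11.8,98.1,27.9)

fit = lm(y ~ x)

# Hypothetical inputs (not from the question)
new_x = data.frame(x = c(15, 50, 100))

# Point predictions from the fitted line
pred = predict(fit, newdata = new_x)
pred

# Point predictions with 95% prediction intervals
predict(fit, newdata = new_x, interval = "prediction", level = 0.95)
```

Using the rounded equation y = 10.0193 + 3.4746x, the point predictions come to about 62.1, 183.7, and 357.5. With a residual standard error of 29.61, an individual observation can still fall well away from these values, which is what the prediction intervals quantify.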