Question

1) Generate a data set with three variables (X, Y and Z). X and Y have 10 observations for each (N=10), and Z has 13 observations (N=13). Each observation should have two digits (such as “83” or “8.3”).

2) Draw a stem-and-leaf display for variable Z only and draw a box plot display for variable Z after specifying the 5 numbers (UEX, LEX, FU, FL, MD).

3) Calculate the mean and standard deviation for variable X

4) Calculate the mean and standard deviation for variable Y

5) In order to predict Y from X, we need to set up a regression equation: (a) Calculate two regression constants (slope and y-intercept) and (b) present the equation.

6) As you have the mean for variable X and Y (from questions 3 and 4 above), once you have the mean for variable Z, can you obtain the mean for the entire data set by computing the mean of the three means? Why or why not? Explain.

Explanation / Answer

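####   (1) ####
# Each variable is drawn from a normal distribution (mean 50, sd 5) and rounded to 2 decimals.
# No seed is set, so the values differ on every run.
X=round(rnorm(10,50,5),2)   # N = 10
Y=round(rnorm(10,50,5),2)   # N = 10
Z=round(rnorm(13,50,5),2)   # N = 13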
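
The question asks for two-digit observations (such as "83"), whereas rounding to two decimals produces values like 49.68. A minimal sketch of one way to get two-digit values is shown below. The seed, the helper name two_digit, and the names X2, Y2, Z2 are illustrative assumptions and are not part of the original answer, so the output further down does not correspond to this data.

```r
# Hypothetical seed so the example is reproducible (not in the original answer).
set.seed(123)

# Hypothetical helper: draw from N(mean, sd), round to whole numbers,
# and clamp to the two-digit range 10..99.
two_digit <- function(n, mean = 50, sd = 5) {
  v <- round(rnorm(n, mean, sd))
  pmin(pmax(v, 10), 99)
}

X2 <- two_digit(10)   # N = 10
Y2 <- two_digit(10)   # N = 10
Z2 <- two_digit(13)   # N = 13

print(X2)
print(Y2)
print(Z2)
```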
######## (2) ######
stem(Z)

The decimal point is 1 digit(s) to the right of the |

4 | 4
4 | 677899
5 | 02
5 | 668
6 | 0

boxplot(Z)


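summary(Z)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  43.73   47.39   49.06   50.90   55.50   60.05

# Five numbers for the box plot (with N = 13 the quartiles coincide with the fourths):
# LEX (lower extreme) = 43.73, FL (lower fourth) = 47.39, MD (median) = 49.06,
# FU (upper fourth) = 55.50, UEX (upper extreme) = 60.05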
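
The five numbers requested in part 2 can also be obtained directly with fivenum(). The sketch below uses the 13 values as they appear in the stem-and-leaf display above (rounded to whole numbers), so the results approximate the exact figures. The labels assume that LEX, FL, MD, FU and UEX stand for lower extreme, lower fourth, median, upper fourth and upper extreme.

```r
# Values read off the stem-and-leaf display (rounded to whole numbers).
z <- c(44, 46, 47, 47, 48, 49, 49, 50, 52, 56, 56, 58, 60)

# Tukey's five-number summary: extremes, fourths (hinges) and median.
five <- fivenum(z)
names(five) <- c("LEX", "FL", "MD", "FU", "UEX")
print(five)   # 44 47 49 56 60

# The statistics a box plot would draw (whisker ends, hinges, median).
# With no outliers they match the five-number summary.
box_stats <- boxplot.stats(z)$stats
print(box_stats)

# boxplot(z, horizontal = TRUE) would draw the plot itself.
```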

######### (3) ####
> mean(X)
[1] 49.683
> sd(X)
[1] 5.123309
> ###########(4) #####
> mean(Y)
[1] 50.704
> sd(Y)
[1] 5.522938
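
For reference, mean() and sd() can be reproduced by hand: the mean is the sum divided by N, and sd() uses the sample formula with N - 1 in the denominator. The sketch below uses a made-up vector x (an assumption, not the X or Y generated above) to show that the hand calculation matches the built-in functions.

```r
# Hypothetical two-digit sample (not the data from the answer).
x <- c(42, 47, 51, 55, 49, 53, 46, 58, 50, 44)
n <- length(x)

# Mean: sum of the observations divided by N.
x_bar <- sum(x) / n

# Sample standard deviation: squared deviations summed, divided by N - 1.
s <- sqrt(sum((x - x_bar)^2) / (n - 1))

# Both hand calculations agree with the built-in functions.
stopifnot(isTRUE(all.equal(x_bar, mean(x))))
stopifnot(isTRUE(all.equal(s, sd(x))))

cat("mean =", x_bar, " sd =", s, "\n")
```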

######### (5) ####
L=lm(Y~X)
summary(L)

Call:
lm(formula = Y ~ X)

Residuals:
     Min       1Q   Median       3Q      Max
-10.7655  -0.7191   0.0195   2.8518   5.9810

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  80.4052    15.8325   5.078 0.000955 ***
X            -0.5978     0.3172  -1.885 0.096175 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 4.875 on 8 degrees of freedom
Multiple R-squared: 0.3075,    Adjusted R-squared: 0.221
F-statistic: 3.553 on 1 and 8 DF, p-value: 0.09617


(a) Slope b = -0.5978, y-intercept a = 80.4052
(b) Y = 80.41 - 0.60 X
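
The two regression constants can also be computed without lm(): the slope is the sum of cross-products of deviations divided by the sum of squared deviations of X, and the intercept is mean(Y) minus slope times mean(X). The sketch below uses made-up vectors x and y (assumptions, not the data from the answer) and then checks the intercept reported above against the reported means.

```r
# Hypothetical paired data (not the X and Y from the answer).
x <- c(42, 47, 51, 55, 49, 53, 46, 58, 50, 44)
y <- c(55, 52, 48, 47, 51, 46, 53, 43, 50, 56)

# Least-squares slope and intercept from the textbook formulas.
b <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
a <- mean(y) - b * mean(x)

# lm() returns the same constants, in the order (intercept, slope).
fit <- lm(y ~ x)
stopifnot(isTRUE(all.equal(unname(coef(fit)), c(a, b))))
cat(sprintf("Y = %.4f + (%.4f) X\n", a, b))

# Consistency check using the figures reported in the answer:
# intercept = mean(Y) - slope * mean(X)
check <- 50.704 - (-0.5978) * 49.683
print(check)   # about 80.40, matching 80.4052 up to rounding of the slope
```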

######### (6) ####
> mean(Z)
[1] 50.9
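> mean(c(mean(X), mean(Y), mean(Z)))   # mean() averages only its first argument, so the three means go inside c()
[1] 50.429

No. The mean of the three means (50.429) weights each variable equally, but Z has 13 observations while X and Y have 10 each. The mean of the entire data set weights each observation equally: (10*49.683 + 10*50.704 + 13*50.9)/33 = 50.47. The two agree only when all groups have the same N.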
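
The difference between the mean of the three means and the mean of the entire data set can be sketched from the figures reported above. The variable names below are illustrative; the means and sample sizes come from the answer and the question.

```r
# Means reported above and sample sizes given in the question.
means <- c(X = 49.683, Y = 50.704, Z = 50.9)
n     <- c(X = 10, Y = 10, Z = 13)

# Note: mean() averages only its first argument, so the means are
# combined into one vector before averaging.
mean_of_means <- mean(means)               # each variable weighted equally
grand_mean    <- weighted.mean(means, n)   # each observation weighted equally

print(mean_of_means)   # 50.429
print(grand_mean)      # about 50.4718, the same as sum(n * means) / sum(n)
```

The two values differ because Z contributes 13 of the 33 observations but only one third of the unweighted average.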