Question
I am using RStudio and wanted to be sure I am inputting the right formulas. Thank you in advance.
Simple linear regression
Is there a relationship between the ages of husband and wife? You will use the data set to decide
whether the age of the wife can be predicted from the age of the husband in a married couple.
(a) Load the data from hwageA01.txt (4 points)
(Note that this is a text file, so use the appropriate instruction. If you are having trouble uploading the
data, open it to see its contents and type the data in: one vector for husbands' ages and one vector for wives' ages.
Ignore the year data.)
The data are ages of husband and wives (in years) from a randomly selected group of married couples.
HAge WAge
49 43
25 28
40 30
52 57
58 52
32 27
43 52
42 39
47 43
31 23
(b) Draw a scatterplot with age of husband in the x-axis and age of wife in the y-axis. Does there seem to
be a linear relationship between the two variables? (4 points)
(c) Find the linear correlation coefficient between these variables. What does it tell you about the linear
relationship? (3 points)
(d) Obtain the linear model and summary. Write down the regression equation that relates age of a
husband with age of his wife. Add the line to the scatterplot. (8 points)
(e) Test for significance of the regression at α = 0.05. State the null and alternative hypotheses. Can the
model be used for predictions? Justify your conclusion using the summary in (d). (8 points)
(f) State the coefficient of determination. What percentage of variation in the age of a wife is explained
by the age of her husband? (3 points)
(g) Draw diagnostic plots (a plot of the x-variable vs. residuals, and a normal probability plot for the
residuals). Do assumptions appear to be satisfied? (8 points)
Explanation / Answer
a. Loading data: data = read.table("hwageA01.txt", header = TRUE)
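If the file will not load, the question allows typing the data in by hand. A minimal sketch, assuming the ten rows shown in the question are the complete data set; the data frame name `ages` is a hypothetical choice, not part of the original answer:

```r
# Ages (in years) typed in from the table in the question.
HAge <- c(49, 25, 40, 52, 58, 32, 43, 42, 47, 31)
WAge <- c(43, 28, 30, 57, 52, 27, 52, 39, 43, 23)

# Collect the two vectors in a data frame, mirroring what
# read.table(..., header = TRUE) would return for a two-column file.
ages <- data.frame(HAge = HAge, WAge = WAge)

# Quick check: both columns should be numeric with 10 rows each.
str(ages)
```

Keeping the columns in a data frame lets later calls such as lm() refer to them through a `data` argument instead of relying on loose vectors in the workspace.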
b. Scatterplot: plot(data$HAge, data$WAge, xlab = "Husband's age", ylab = "Wife's age", main = "Wife's Age vs Husband's Age")
c. Linear correlation coefficient: cor(data$HAge, data$WAge)
A positive value indicates a positive linear association: wife's age tends to increase as husband's age increases. The closer the value is to 1, the stronger the linear relationship. A negative value indicates a negative association, meaning that as one variable increases, the other tends to decrease. A value near 0 indicates little or no linear relationship.
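Parts (b) and (c) can be sketched together as follows. This assumes the data were typed in as two vectors from the table in the question; the axis labels are illustrative choices:

```r
# Ages (in years) typed in from the table in the question.
HAge <- c(49, 25, 40, 52, 58, 32, 43, 42, 47, 31)
WAge <- c(43, 28, 30, 57, 52, 27, 52, 39, 43, 23)

# Scatterplot with husband's age on the x-axis and wife's age on the y-axis.
plot(HAge, WAge,
     xlab = "Husband's age (years)",
     ylab = "Wife's age (years)",
     main = "Wife's Age vs Husband's Age")

# Pearson correlation coefficient (the default method of cor()).
r <- cor(HAge, WAge)
print(r)
```

A value of `r` close to 1, together with points that rise roughly along a straight line in the plot, would support a positive linear relationship.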
d. Linear model: fit = lm(WAge ~ HAge, data = data)
Summary: summary(fit)
The estimates of the intercept a and the slope b appear in the Coefficients table of the summary. The linear regression equation is WAge = a + b*HAge
To add the regression line to the scatterplot: abline(fit)
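A minimal sketch of part (d), assuming the data from the question are typed into a data frame (the name `ages` is hypothetical). It pulls the intercept and slope out of the fitted model so that the regression equation can be written down, then draws the fitted line over the scatterplot:

```r
# Data typed in from the table in the question.
ages <- data.frame(
  HAge = c(49, 25, 40, 52, 58, 32, 43, 42, 47, 31),
  WAge = c(43, 28, 30, 57, 52, 27, 52, 39, 43, 23)
)

# Fit the simple linear regression of wife's age on husband's age.
fit <- lm(WAge ~ HAge, data = ages)
print(summary(fit))

# coef() returns a named vector: "(Intercept)" and "HAge".
a <- coef(fit)[["(Intercept)"]]  # intercept
b <- coef(fit)[["HAge"]]         # slope
cat("WAge =", round(a, 3), "+", round(b, 3), "* HAge\n")

# Scatterplot with the fitted regression line added on top.
plot(ages$HAge, ages$WAge,
     xlab = "Husband's age (years)", ylab = "Wife's age (years)")
abline(fit)
```

abline() accepts the fitted lm object directly, so the intercept and slope do not need to be passed by hand.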
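e. The p-value for the overall F-test appears on the last line of the summary output. In simple linear regression it equals the p-value of the t-test for the slope. If the p-value is less than 0.05, we reject the null hypothesis.
Null: The slope is zero (there is no linear relationship between husband's age and wife's age).
Alternative: The slope is not zero.
If the null is rejected, the regression is significant and the model can be used for prediction.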
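The test in part (e) can also be carried out in code rather than read off the printout. A sketch, assuming the data from the question are typed into a hypothetical data frame `ages`; the variable names `p_slope`, `p_F` and `reject_null` are illustrative:

```r
# Data typed in from the table in the question.
ages <- data.frame(
  HAge = c(49, 25, 40, 52, 58, 32, 43, 42, 47, 31),
  WAge = c(43, 28, 30, 57, 52, 27, 52, 39, 43, 23)
)
fit <- lm(WAge ~ HAge, data = ages)
s <- summary(fit)

# p-value of the t-test for the slope, taken from the coefficients table.
p_slope <- s$coefficients["HAge", "Pr(>|t|)"]

# p-value of the overall F-test, rebuilt from the stored F statistic
# (a named vector with elements "value", "numdf" and "dendf").
f <- s$fstatistic
p_F <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)

# Compare with the significance level from the question.
alpha <- 0.05
reject_null <- p_slope < alpha
print(c(p_slope = p_slope, p_F = p_F))
print(reject_null)
```

With a single predictor the two p-values coincide, so either one can be compared with 0.05.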
f. Coefficient of determination = r squared (reported as "Multiple R-squared" in the summary)
Formula: summary(fit)$r.squared
Percentage of variation in wife's age explained by husband's age = r-squared*100.
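As a check on part (f), the coefficient of determination of a simple linear regression equals the square of the correlation coefficient from part (c). A short sketch, again assuming the data are typed into a hypothetical data frame `ages`:

```r
# Data typed in from the table in the question.
ages <- data.frame(
  HAge = c(49, 25, 40, 52, 58, 32, 43, 42, 47, 31),
  WAge = c(43, 28, 30, 57, 52, 27, 52, 39, 43, 23)
)
fit <- lm(WAge ~ HAge, data = ages)

# Coefficient of determination from the model summary.
r2 <- summary(fit)$r.squared

# The same quantity obtained by squaring the correlation coefficient.
r <- cor(ages$HAge, ages$WAge)
print(all.equal(r2, r^2))

# Percentage of variation in wife's age explained by husband's age.
cat(round(100 * r2, 1), "% of the variation in WAge is explained by HAge\n")
```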