In RStudio, create a dataset (using `data.frame`) with two continuous variab
Question
In RStudio, create a dataset (using `data.frame`) with two continuous variables of length 20, and one factor with 2 levels (a binomial variable), with 10 replicates each.
For the continuous variables, either enter them manually or use a random-number-generating function that is appropriate for continuous variables.
Give the variables appropriate names and explain where the data set could have originated from / what experiment could have resulted in this data set.
5. Using the data set created in (4), test whether the two continuous variables are correlated. Include a null hypothesis, test for normality, use the appropriate test, and conclude, reporting the exact p-value.
Can someone please help me with this? Thank you.
Explanation / Answer
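1. The code is given as follows:
# Create the dataset (all values entered manually)
Height <- c(100,110,122,120,119,132,140,130,123,145,109,133,156,134,155,124,136,127,138,129)
Weight <- c(65,67,69,70,72,65,56,45,76,68,60,58,59,65,67,68,61,64,57,58)
# Store the indicator as a factor with 2 levels (0 and 1), 10 replicates each;
# a plain numeric vector of 0s and 1s would not be a factor
Indicator <- factor(c(0,0,1,1,1,0,1,0,0,1,0,1,1,1,0,1,0,0,1,0), levels = c(0, 1))
dataset <- data.frame(Height, Weight, Indicator)
dataset
# Test each continuous variable for normality (this justifies the Pearson test)
shapiro.test(Height)
shapiro.test(Weight)
# Correlation test
cor.test(Height, Weight, method = "pearson")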
2. Explanation of the dataset:
The two continuous variables are the Height and Weight of 20 employees in a company. The Indicator variable (the factor with 2 levels and 10 replicates each) records whether each employee is following a diet: 1 denotes an employee who is on the diet and 0 denotes an employee who is not.
All the variables are manually entered.
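The question also allows generating the continuous variables with a random number function. A minimal sketch of that alternative is shown here, assuming normally distributed values from `rnorm()`. The seed, the means and standard deviations, and the names `sim` and `Diet` are illustrative assumptions, chosen only to resemble the manually entered data loosely.

```r
# Sketch only: simulated data instead of manual entry.
# Seed, means, and standard deviations are assumptions, not from the answer.
set.seed(42)  # makes the random draws reproducible

n <- 20
sim <- data.frame(
  Height = rnorm(n, mean = 130, sd = 14),  # continuous variable 1
  Weight = rnorm(n, mean = 63, sd = 7),    # continuous variable 2
  # factor with 2 levels, 10 replicates each
  Diet   = factor(rep(c("no", "yes"), each = 10))
)

str(sim)         # confirms two numeric columns and one factor
table(sim$Diet)  # confirms 10 observations per level
```

`rep(..., each = 10)` guarantees exactly 10 replicates per level, which sampling the levels at random would not.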
3. Test for correlation between the continuous variables:
Null Hypothesis: The true correlation between Height and Weight is 0 (the variables are not correlated) vs Alternative Hypothesis: The true correlation is not 0 (the variables are correlated)
Result:
Pearson's product-moment correlation
data: Height and Weight
t = -1.0872, df = 18, p-value = 0.2913
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.6223890 0.2182587
sample estimates:
cor
-0.2482379
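Interpretation: The p-value (0.2913) is greater than 0.05, so we fail to reject the null hypothesis. The sample correlation is r = -0.248, and there is no statistically significant evidence that Height and Weight are correlated. Note that failing to reject the null hypothesis does not prove that the variables are uncorrelated.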
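Since the question asks for the exact p-value, it may help to read it from the test object rather than from the rounded printout. The sketch below stores the `cor.test()` result in a variable (the name `res` is an assumption) and, as a cross-check, recomputes the t statistic and two-sided p-value from r using the standard formula t = r * sqrt(n - 2) / sqrt(1 - r^2).

```r
Height <- c(100,110,122,120,119,132,140,130,123,145,109,133,156,134,155,124,136,127,138,129)
Weight <- c(65,67,69,70,72,65,56,45,76,68,60,58,59,65,67,68,61,64,57,58)

res <- cor.test(Height, Weight, method = "pearson")

res$estimate  # sample correlation coefficient r
res$p.value   # unrounded p-value
res$conf.int  # 95 percent confidence interval for the correlation

# Cross-check: recompute t and the two-sided p-value by hand
r <- unname(res$estimate)
n <- length(Height)
t_manual <- r * sqrt(n - 2) / sqrt(1 - r^2)
p_manual <- 2 * pt(-abs(t_manual), df = n - 2)
p_manual
```

The hand-computed values should agree with the printed t = -1.0872 and p-value = 0.2913 up to rounding.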
4. Tests for normality:
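We use the Shapiro-Wilk test for normality. It is applied to each continuous variable separately and is a common choice for small samples such as n = 20. The result matters because the Pearson correlation test assumes the variables are normally distributed.
Null Hypothesis: The variable comes from a normally distributed population vs Alternative Hypothesis: The variable does not come from a normally distributed population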
Result:
Shapiro-Wilk normality test
data: Height
W = 0.98056, p-value = 0.9411
Shapiro-Wilk normality test
data: Weight
W = 0.95358, p-value = 0.4247
Interpretation: For both Height (p-value = 0.9411) and Weight (p-value = 0.4247), the p-values are greater than 0.05 (the level of significance), so we fail to reject the null hypotheses. There is no evidence that either variable departs from a Normal distribution, so the Pearson correlation test is an appropriate choice.
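The link between the normality check and "use the appropriate test" can be made explicit in code. The sketch below assumes a common rule of thumb, which is not stated in the answer: use Pearson when both Shapiro-Wilk p-values exceed the significance level, and otherwise fall back to the rank-based Spearman correlation. The names `alpha`, `both_normal` and `chosen_method` are illustrative.

```r
Height <- c(100,110,122,120,119,132,140,130,123,145,109,133,156,134,155,124,136,127,138,129)
Weight <- c(65,67,69,70,72,65,56,45,76,68,60,58,59,65,67,68,61,64,57,58)

alpha <- 0.05  # significance level used in the answer

# Shapiro-Wilk is run on each variable separately
sw_height <- shapiro.test(Height)
sw_weight <- shapiro.test(Weight)

# Assumed rule of thumb: Pearson if neither test rejects normality,
# otherwise Spearman (which may warn about ties in the data)
both_normal <- sw_height$p.value > alpha && sw_weight$p.value > alpha
chosen_method <- if (both_normal) "pearson" else "spearman"

res <- cor.test(Height, Weight, method = chosen_method)
chosen_method
res$p.value
```

For this data set neither Shapiro-Wilk test rejects normality, so the sketch selects the Pearson test used in the answer.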