


Question

3. Icicles form on building eaves and trees in very cold environments. Scientists wanted to know more about the growth of these icicles. An icicle was grown in a cold chamber at -11°C with no wind and a water flow of 11.9 mg/sec. The length (in cm) of the icicle was measured as a function of time (in minutes). The data are given below:

Time (min)   Length (cm)
10           0.6
20           1.8
30           2.9
40           4
50           5
60           6.1
70           7.9
80           10.1
90           10.9
100          12.7
110          14.4
120          16.6
130          18.1
140          19.9
150          21
160          23.4
170          24.7
180          27.8

a) Verify that a linear model is not appropriate for this dataset.

b) Explore the data given above and determine an appropriate regression model for it. Fit your choice of regression model to the data and determine the goodness of fit. You may need to apply a transformation to the data set in order to obtain a suitable regression model. You may repeat the process of exploration, transformation, model fitting and model checking more than once, either iteratively or simultaneously, but you only need to present your best-fit model in the hardcopy submission. However, please keep your R code for all the transformation, model fitting and model checking that you have applied (if any) in your final R file submission.

Explanation / Answer

a) First check whether there is a linear relationship between length (y) and time (t). The scatter plot looks roughly linear at first glance, so fit the linear regression model and examine its residuals: plot the residuals against the fitted values and draw a histogram of the residuals. The R code is:

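# original model
t <- seq(10, 180, by = 10)    # time (min)
y <- c(0.6, 1.8, 2.9, 4, 5, 6.1, 7.9, 10.1, 10.9, 12.7, 14.4, 16.6, 18.1, 19.9, 21, 23.4, 24.7, 27.8)    # length (cm)
plot(t, y)                    # scatter plot of the raw data
l1 <- lm(y ~ t)               # straight-line fit
res1 <- residuals(l1)
plot(predict(l1), res1)       # residuals against fitted values
hist(res1)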

From the plots you can see that the residuals are not randomly scattered and that the histogram is not bell-shaped. The linear model is therefore not appropriate, because the residuals do not satisfy the normality assumption, so the response variable needs to be transformed.
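The visual check above can be backed by a numeric one. The following is a minimal sketch, not part of the original answer: it adds a squared time term to the straight-line model and uses an F-test to ask whether that term explains noticeably more variation. The quadratic term is only a probe for curvature here, and the names fit_lin, fit_quad and cmp are illustrative.

```r
# Data from the question (time in minutes, length in cm)
t <- seq(10, 180, by = 10)
y <- c(0.6, 1.8, 2.9, 4, 5, 6.1, 7.9, 10.1, 10.9, 12.7, 14.4, 16.6,
       18.1, 19.9, 21, 23.4, 24.7, 27.8)

fit_lin  <- lm(y ~ t)              # straight-line model
fit_quad <- lm(y ~ t + I(t^2))     # same model plus a curvature term (probe only)

# F-test for nested models: does the squared term reduce the residual
# sum of squares by more than chance would explain?
cmp <- anova(fit_lin, fit_quad)
print(cmp)

# Signs of the straight-line residuals in time order; long runs of the
# same sign suggest a systematic pattern rather than random scatter
print(sign(residuals(fit_lin)))
```

A small p-value in the second row of the table would support the conclusion that a straight line misses curvature in the data.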

b) Here we use the Box-Cox transformation, which is given by new_response = (old_response^lambda - 1) / lambda when lambda is not equal to zero. If lambda is zero, then new_response = log(old_response). In the code, lambda is stored in the variable lamda. The R code is:

# log-transformation (the Box-Cox case lamda = 0)
y1 <- log(y)
l2 <- lm(y1 ~ t)
res2 <- residuals(l2)
plot(predict(l2), res2)       # residuals against fitted values
hist(res2)

# Box-Cox transformation: fit the transformed response for a given nonzero lamda
f <- function(lamda)
{
  y2 <- (y^lamda - 1) / lamda
  lm(y2 ~ t)
}


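# residual sum of squares for each lamda on a grid from -4 to 4
lamda <- seq(-4, 4, 0.1)
ssres <- numeric(length(lamda))
for (i in seq_along(lamda))
{
  if (abs(lamda[i]) > 1e-8) {
    ssres[i] <- sum(resid(f(lamda[i]))^2)
  } else {
    ssres[i] <- 10000000      # formula is undefined at lamda = 0, so exclude it
  }
}
p <- which.min(ssres)         # index of the smallest residual sum of squares
lamda[p]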
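One caveat about the grid search above: the residual sum of squares depends on the scale of the transformed response, so comparing it across values of lamda is only a rough guide. The usual criterion is the profile log-likelihood, which the boxcox() function in the MASS package computes. The sketch below is an alternative check rather than the method used in this answer; it assumes MASS is installed, the name lambda_hat is illustrative, and the value it returns may differ from the grid-search result.

```r
library(MASS)   # provides boxcox(); assumed to be installed

# Data from the question (time in minutes, length in cm)
t <- seq(10, 180, by = 10)
y <- c(0.6, 1.8, 2.9, 4, 5, 6.1, 7.9, 10.1, 10.9, 12.7, 14.4, 16.6,
       18.1, 19.9, 21, 23.4, 24.7, 27.8)

# Profile log-likelihood of the Box-Cox parameter on a grid;
# plotit = FALSE returns the grid (x) and log-likelihood (y) without plotting
bc <- boxcox(y ~ t, lambda = seq(-2, 2, by = 0.1), plotit = FALSE)

# Grid value with the largest profile log-likelihood
lambda_hat <- bc$x[which.max(bc$y)]
print(lambda_hat)
```

Calling boxcox() with plotit = TRUE instead draws the log-likelihood curve with an approximate 95% confidence interval for lambda, which helps judge whether a round value such as 0.5 or 1 is also plausible.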


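# main model
library(lmtest)               # provides bptest()
ynew <- (y^0.7 - 1) / 0.7     # Box-Cox transform with lamda = 0.7
lnew <- lm(ynew ~ t)
plot(t, ynew)
resnew <- residuals(lnew)
plot(predict(lnew), resnew)   # residuals against fitted values
plot(lnew)                    # standard lm diagnostic plots
shapiro.test(resnew)          # Shapiro-Wilk test of residual normality
bptest(lnew)                  # Breusch-Pagan test for heteroscedasticity
hist(resnew)
summary(lnew)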

The log transformation is not suitable, because its residual plot is not random and its histogram is highly skewed. The lamda[p] command returns the value of lambda that gives the smallest residual sum of squares over the grid, which here is lambda = 0.7. The Box-Cox formula is undefined at lambda = 0, so the code assigns that grid point a very large residual sum of squares to exclude it from the search. Fitting the linear regression model to the transformed response then gives an R-squared value of 0.9978.

The fitted regression model is: ynew = -0.84435 + 0.07794 * t, where ynew = (y^0.7 - 1) / 0.7 is the transformed length.
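Because the coefficients describe the transformed response, a prediction has to be transformed back to obtain a length in cm. The following is a minimal sketch using the coefficients reported above; the helper names inv_boxcox and predict_length, and the time value 100, are illustrative and not part of the original answer.

```r
# Coefficients reported above for the transformed response (lambda = 0.7)
b0 <- -0.84435
b1 <- 0.07794
lambda <- 0.7

# Inverse of the Box-Cox transformation, valid for lambda != 0:
# z = (y^lambda - 1) / lambda  =>  y = (lambda * z + 1)^(1 / lambda)
inv_boxcox <- function(z, lambda) (lambda * z + 1)^(1 / lambda)

# Predicted icicle length (cm) at a given time (min)
predict_length <- function(new_t) inv_boxcox(b0 + b1 * new_t, lambda)

# Illustrative prediction at t = 100 min; the observed length there is 12.7 cm
print(predict_length(100))
```

Comparing such back-transformed predictions with the observed lengths gives a feel for the fit on the original scale, which R-squared on the transformed scale does not show directly.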

Now check the residual plot and the histogram. The residuals appear random and the histogram is roughly bell-shaped, so the assumptions of normality and homoscedasticity of the errors appear to be met.

For the graphs please run the R code given in the answer for each case and observe the results.
