
Reanalyze the data in Exercise 12.20 with number of native species as the respon


Question

Reanalyze the data in Exercise 12.20 with number of native species as the response, but using log-linear regression. (a) Fit the model with log area, log elevation, log of distance from nearest island, and log area of nearest island as explanatory variables; and then check for extra-Poisson variation. (b) Use backward elimination to eliminate insignificant explanatory variables. (c) Describe the effects of the remaining explanatory variables.

https://github.com/jarad/jarad.github.com/blob/master/courses/stat401A/lab/sleuth3csv/ex2118.csv

In R:

library(Sleuth3)

ex2218

20. Galapagos Islands. The data in Display 12.17 come from a 1973 study. (Data from M. P. Johnson and P. H. Raven, "Species Number and Endemism: The Galapagos Archipelago Revisited," Science 179 (1973): 893-5.) The number of species on an island is known to be related to the island's area. Of interest is what other variables are also related to the number of species, after island area is accounted for, and whether the answer differs for native species and nonnative species. (Note: Elevations for five of the islands were missing and have been replaced by estimates for purposes of this exercise.)

Explanation / Answer

No details of the previous exercise are given, so we work from the information in the question. The question refers to the data set ex1220 from the Sleuth3 package.

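The complete R snippet is below. It uses the glm function to fit a log-linear model with the Poisson family. Note that the snippet as written uses Total as the response and untransformed explanatory variables, whereas the question asks for native species and log-transformed variables.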

library(Sleuth3)
data("ex1220")
ex1220

## Fit the model
mod0 <- glm(Total ~ Area + Elev + DistNear + AreaNear,
            data = ex1220, family = poisson)
summary(mod0)

## Check the residual deviance against a chi-squared distribution;
## a tiny p-value points to lack of fit or extra-Poisson variation
pchisq(deviance(mod0), df = df.residual(mod0), lower.tail = FALSE)

## Run backward elimination; step() compares candidate models by AIC
step(mod0, direction = "backward", trace = TRUE)

############################
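The deviance check in the snippet returns only a p-value. As a rough sketch, the ratio of residual deviance (or the Pearson statistic) to residual degrees of freedom is also commonly inspected, with values well above 1 suggesting extra-Poisson variation. The helper name check_overdispersion and the simulated data are assumptions made for illustration; they are not part of the original answer or of any package.

```r
# Summarize evidence of extra-Poisson variation for a fitted Poisson glm.
# check_overdispersion is a hypothetical helper name used only for this sketch.
check_overdispersion <- function(model) {
  df <- df.residual(model)
  dev <- deviance(model)
  pearson <- sum(residuals(model, type = "pearson")^2)
  c(deviance_ratio = dev / df,       # residual deviance per degree of freedom
    pearson_ratio  = pearson / df,   # Pearson estimate of the dispersion
    p_value        = pchisq(dev, df = df, lower.tail = FALSE))
}

# Toy illustration on simulated counts (not the Galapagos data)
set.seed(1)
toy <- data.frame(x = runif(50))
toy$y <- rpois(50, lambda = exp(1 + toy$x))
toy_fit <- glm(y ~ x, data = toy, family = poisson)
print(check_overdispersion(toy_fit))
```

Applied to the fitted model in this answer, the call would be check_overdispersion(mod0).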

The results are

> summary(mod0)

Call:
glm(formula = Total ~ Area + Elev + DistNear + AreaNear, family = poisson,
data = ex1220)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-11.1537   -4.3318   -0.7932    2.3551    9.9384

Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)  2.998e+00  4.973e-02  60.285   <2e-16 ***
Area        -5.837e-04  2.574e-05 -22.676   <2e-16 ***
Elev         3.606e-03  8.588e-05  41.992   <2e-16 ***
DistNear    -3.328e-03  1.445e-03  -2.303   0.0213 *
AreaNear    -7.596e-04  2.786e-05 -27.271   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for poisson family taken to be 1)

Null deviance: 3510.73 on 29 degrees of freedom
Residual deviance: 812.97 on 25 degrees of freedom
AIC: 983.8

Number of Fisher Scoring iterations: 5

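All four explanatory variables have p-values below 0.05, so each is statistically significant in this fit. However, the residual deviance of 812.97 on 25 degrees of freedom is far larger than its degrees of freedom, which indicates substantial extra-Poisson variation. The standard errors and p-values above therefore overstate the strength of the evidence.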
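Because the model uses a log link, each coefficient acts multiplicatively on the mean count. The sketch below exponentiates the estimates copied from the summary output above. The 100-unit rescaling of Elev is an illustrative choice, not part of the original answer, and the interpretation applies to this particular fit (Total as response, untransformed predictors).

```r
# Estimates copied from the summary(mod0) output above
est <- c(Area = -5.837e-04, Elev = 3.606e-03,
         DistNear = -3.328e-03, AreaNear = -7.596e-04)

# Multiplicative change in the mean count per one-unit increase in each variable
rate_ratio <- exp(est)
print(round(rate_ratio, 5))

# The same effect expressed per 100-unit increase in elevation (illustrative scale)
elev_100 <- unname(exp(100 * est["Elev"]))
print(round(elev_100, 3))
```

Under this fit, a 100-unit increase in Elev multiplies the estimated mean count by roughly 1.43, with the other variables held fixed; the three negative coefficients correspond to ratios slightly below 1.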

Backward elimination with step() stops at the first step. step() compares models by AIC rather than by p-values, and removing any one variable raises the AIC, so no explanatory variable is dropped.

> step(mod0,direction="backward",trace=TRUE)
Start: AIC=983.8
Total ~ Area + Elev + DistNear + AreaNear

           Df Deviance    AIC
<none>          812.97  983.8
- DistNear  1   818.37  987.2
- Area      1  1324.29 1493.1
- AreaNear  1  1801.10 1969.9
- Elev      1  2587.27 2756.1

Call: glm(formula = Total ~ Area + Elev + DistNear + AreaNear, family = poisson,
data = ex1220)

Coefficients:
(Intercept)         Area         Elev     DistNear     AreaNear
  2.9982177   -0.0005837    0.0036061   -0.0033281   -0.0007597

Degrees of Freedom: 29 Total (i.e. Null); 25 Residual
Null Deviance:   3511
Residual Deviance: 813    AIC: 983.8
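
The question asks for number of native species as the response, with log-transformed explanatory variables. A minimal sketch of that fit follows, assuming ex1220 has columns named Native, Area, Elev, DistNear, and AreaNear; fit_native is a hypothetical helper name. Refitting with the quasipoisson family and using drop1 F-tests is one common way to carry out backward elimination when extra-Poisson variation is present, and may not be the exact procedure the exercise intends.

```r
# Log-linear regression for native species with logged predictors.
# Assumes a data frame with columns Native, Area, Elev, DistNear, AreaNear.
# fit_native is a hypothetical helper name used only for this sketch.
fit_native <- function(dat, fam = poisson) {
  glm(Native ~ log(Area) + log(Elev) + log(DistNear) + log(AreaNear),
      data = dat, family = fam)
}

if (requireNamespace("Sleuth3", quietly = TRUE)) {
  gal <- Sleuth3::ex1220

  # (a) Poisson fit; a deviance/df ratio well above 1 suggests extra-Poisson variation
  mod_pois <- fit_native(gal)
  print(deviance(mod_pois) / df.residual(mod_pois))

  # Quasi-likelihood refit: same coefficients, standard errors scaled by the
  # square root of the estimated dispersion
  mod_quasi <- fit_native(gal, fam = quasipoisson)
  print(summary(mod_quasi))

  # (b) F-tests for dropping each term; remove the least significant term,
  # refit, and repeat until every remaining term is significant
  print(drop1(mod_quasi, test = "F"))
}
```

For part (c), a coefficient b on log(x) means that doubling x multiplies the mean count by 2^b, so the effects of the retained variables can be described on that scale.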