Question
Logistic regression. The data file harrell.csv contains data on 40 people. The variables are
age in years, gender, a categorical variable with two levels, female and male, and response, a 0/1
indicator variable for whether the person responded to a medical treatment (1 means that the person
responded). Fit a logistic regression model with response as the dependent variable and age and
gender as the independent variables.
(a) How does the probability of response change for a 42-year-old male compared to a 52-year-old
male?
(b) Which gender has a higher probability of response to the medical treatment?
(c) What is the effect on the odds of response for a one-year increase in age?
(d) Make a plot of the probability of response as a function of age, with one curve for females and
one curve for males.
Explanation / Answer
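R code (RStudio)
# Read the data; use a forward slash, since "\d" is not a valid escape in an R string
data=read.table("D:/data.txt",header=TRUE)
data
# X1 = age in years
X1=data[,1]
X1
# X2 = gender as a factor with levels 0 and 1
X2=as.factor(data[,2])
X2
# Y = 0/1 response indicator
Y=data[,3]
Y
# Fit the logistic regression of response on age and gender
lm.fit=glm(Y~X1+X2,family=binomial(link="logit"))
lm.fit
# Extract the estimated coefficients
beta=lm.fit$coefficients
beta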
R output
> data=read.table("D:/data.txt",header=TRUE)
> data
age gender response
1 37 0 0
2 39 0 0
3 39 0 0
4 42 0 0
5 47 0 0
6 48 0 0
7 48 0 1
8 52 0 0
9 53 0 0
10 55 0 0
11 56 0 0
12 57 0 0
13 58 0 0
14 58 0 1
15 60 0 0
16 64 0 0
17 65 0 1
18 68 0 1
19 68 0 1
20 70 0 1
21 34 1 1
22 38 1 1
23 40 1 0
24 40 1 0
25 41 1 0
26 43 1 1
27 43 1 1
28 43 1 1
29 44 1 0
30 46 1 0
31 47 1 1
32 48 1 1
33 48 1 1
34 50 1 0
35 50 1 1
36 52 1 1
37 55 1 1
38 61 1 1
39 61 1 1
40 61 1 1
> X1=data[,1]
> X1
[1] 37 39 39 42 47 48 48 52 53 55 56 57 58 58 60 64 65 68 68 70 34 38 40 40 41 43 43
[28] 43 44 46 47 48 48 50 50 52 55 61 61 61
> X2=as.factor(data[,2])
> X2
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Levels: 0 1
> Y=data[,3]
> lm.fit=glm(Y~X1+X2,family=binomial(link="logit"),)
> lm.fit
Call: glm(formula = Y ~ X1 + X2, family = binomial(link = "logit"))
Coefficients:
(Intercept)           X1          X21
    -9.8302       0.1578       3.4849
Degrees of Freedom: 39 Total (i.e. Null); 37 Residual
Null Deviance: 55.45
Residual Deviance: 38.9 AIC: 44.9
> beta=lm.fit$coefficients
> beta
(Intercept)          X1         X21
 -9.8301685   0.1578414   3.4849454
(a)
The fitted model is
log(p/(1-p)) = -9.8301685 + 0.1578414*X1 + 3.4849454*X2 = eta
where X1 is age in years and X2 = 1 if male and 0 if female.
Solving for p:
p/(1-p) = exp(eta)
(1-p)/p = 1/exp(eta)
1/p = 1/exp(eta) + 1
1/p = (1 + exp(eta))/exp(eta)
p = exp(eta)/(1 + exp(eta))
so that
p = exp(-9.8301685 + 0.1578414*X1 + 3.4849454*X2)/(1 + exp(-9.8301685 + 0.1578414*X1 + 3.4849454*X2))
For a male (X2 = 1):
at age 42, p = 0.5705549
at age 52, p = 0.865591
Hence the estimated probability of response is about 0.295 higher for a 52-year-old male than for a 42-year-old male: the probability of response increases with age.
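The two probabilities in (a) can be checked by plugging the printed coefficients into the inverse-logit formula. The following is only a sketch: the helper prob_response and the variable names are made up for illustration, and it assumes, as the answer does, that gender code 1 means male.

```r
# Coefficients copied from the glm output above.
b0     <- -9.8301685   # intercept
b_age  <-  0.1578414   # coefficient of age (X1)
b_male <-  3.4849454   # coefficient of X2; assumes code 1 = male

# Hypothetical helper: fitted probability for a given age and gender code.
prob_response <- function(age, male) {
  eta <- b0 + b_age * age + b_male * male
  plogis(eta)  # identical to exp(eta) / (1 + exp(eta))
}

p42 <- prob_response(42, male = 1)
p52 <- prob_response(52, male = 1)
print(c(age42 = p42, age52 = p52, difference = p52 - p42))
```

plogis is base R's logistic distribution function, so it avoids retyping the long exp(...)/(1 + exp(...)) expression and the parenthesis errors that come with it.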
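(b)
p = exp(-9.8301685 + 0.1578414*X1 + 3.4849454*X2)/(1 + exp(-9.8301685 + 0.1578414*X1 + 3.4849454*X2))
where X2 = 1 if male and 0 if female.
At any given age, males have a higher probability of response than females.
The coefficient of X2 is positive (3.4849454), so with age held fixed the odds of response for a male are exp(3.4849454), about 32.6 times, the odds for the reference level (female), and a higher odds means a higher probability.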
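(c)
The odds of response are
p/(1-p) = exp(-9.8301685 + 0.1578414*X1 + 3.4849454*X2) = exp(eta)
Increasing age by one year with gender held fixed multiplies the odds by exp(0.1578414), about 1.171.
That is, the odds ratio for a one-year increase in age is about 1.171: the odds of response increase by about 17.1% per additional year of age.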
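The odds ratios used in (b) and (c) are just the exponentiated coefficients. A minimal sketch, with the coefficients copied from the output above and variable names chosen here for illustration (again assuming code 1 = male):

```r
# Coefficients copied from the glm output above.
b_age  <- 0.1578414   # coefficient of age (X1)
b_male <- 3.4849454   # coefficient of X2; assumes code 1 = male

or_age    <- exp(b_age)        # odds ratio for a one-year increase in age
or_male   <- exp(b_male)       # odds ratio, male versus female, at a fixed age
or_age_10 <- exp(10 * b_age)   # odds ratio for a ten-year increase (42 vs 52)

print(c(one_year = or_age, male_vs_female = or_male, ten_years = or_age_10))
```

With the fitted model object available, exp(coef(lm.fit)) gives the same one-unit odds ratios directly.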
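(d)
Predict the response probabilities with the predict command (with type="response") over a range of ages, separately for males and females, and then plot the two curves against age on the same graph.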
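One way to carry out (d) is sketched below. It is not the original poster's code: the data are retyped from the printed data frame above so the example runs without the file, the names dat, fit, age_grid, p_female and p_male are illustrative, and the female/male labels rely on the answer's assumption that gender code 0 is female and 1 is male.

```r
# Data retyped from the printed data frame above (gender kept as the 0/1 codes).
age <- c(37, 39, 39, 42, 47, 48, 48, 52, 53, 55, 56, 57, 58, 58, 60, 64, 65, 68, 68, 70,
         34, 38, 40, 40, 41, 43, 43, 43, 44, 46, 47, 48, 48, 50, 50, 52, 55, 61, 61, 61)
gender <- factor(rep(c(0, 1), each = 20))
response <- c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1,
              1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1)
dat <- data.frame(age, gender, response)

# Same model as in the answer, fitted with named columns.
fit <- glm(response ~ age + gender, family = binomial(link = "logit"), data = dat)

# Grid of ages over the observed range; one set of predictions per gender level.
age_grid <- seq(min(dat$age), max(dat$age), by = 0.5)
new_female <- data.frame(age = age_grid, gender = factor("0", levels = c("0", "1")))
new_male   <- data.frame(age = age_grid, gender = factor("1", levels = c("0", "1")))
p_female <- predict(fit, newdata = new_female, type = "response")
p_male   <- predict(fit, newdata = new_male,   type = "response")

# One curve per gender on a common probability scale.
plot(age_grid, p_female, type = "l", lty = 1, ylim = c(0, 1),
     xlab = "Age (years)", ylab = "Probability of response")
lines(age_grid, p_male, lty = 2)
legend("topleft", legend = c("female (code 0)", "male (code 1)"), lty = c(1, 2))
```

Because the model has no age-by-gender interaction, the two curves have the same shape on the logit scale and differ only by a horizontal shift, with the male curve lying above the female curve at every age.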