

Question

Logistic regression. The data file harrell.csv contains data on 40 people. The variables are
age in years, gender, a categorical variable with two levels, female and male, and response, a 0/1
indicator variable for whether the person responded to a medical treatment (1 means that the person
responded). Fit a logistic regression model with response as the dependent variable and age and
gender as the independent variables.
(a) How does the probability of response change for a 42-year-old male compared to a 52-year-old
male?
(b) Which gender has a higher probability of response to the medical treatment?
(c) What is the effect on the odds of response for a one-year increase in age?
(d) Make a plot of the probability of response as a function of age, with one curve for females and
one curve for males.

age  gender  response
37   female  0
39   female  0
39   female  0
42   female  0
47   female  0
48   female  0
48   female  1
52   female  0
53   female  0
55   female  0
56   female  0
57   female  0
58   female  0
58   female  1
60   female  0
64   female  0
65   female  1
68   female  1
68   female  1
70   female  1
34   male    1
38   male    1
40   male    0
40   male    0
41   male    0
43   male    1
43   male    1
43   male    1
44   male    0
46   male    0
47   male    1
48   male    1
48   male    1
50   male    0
50   male    1
52   male    1
55   male    1
61   male    1
61   male    1
61   male    1

Explanation / Answer

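RStudio code

# Read the data; in this file gender is coded 0 = female, 1 = male.
# Use a forward slash in the path: "D:\data.txt" is an invalid escape in R.
data=read.table("D:/data.txt",header=TRUE)
data
# Age in years
X1=data[,1]
X1
# Gender as a factor; level 0 (female) is the reference level
X2=as.factor(data[,2])
X2
# Response indicator (1 = responded to the treatment)
Y=data[,3]
Y
# Logistic regression of response on age and gender
lm.fit=glm(Y~X1+X2,family=binomial(link="logit"))
lm.fit
# Estimated coefficients: intercept, age, male
beta=lm.fit$coefficients
beta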

R output

> data=read.table("D:/data.txt",header=TRUE)

> data
   age gender response
1   37      0        0
2   39      0        0
3   39      0        0
4   42      0        0
5   47      0        0
6   48      0        0
7   48      0        1
8   52      0        0
9   53      0        0
10  55      0        0
11  56      0        0
12  57      0        0
13  58      0        0
14  58      0        1
15  60      0        0
16  64      0        0
17  65      0        1
18  68      0        1
19  68      0        1
20  70      0        1
21  34      1        1
22  38      1        1
23  40      1        0
24  40      1        0
25  41      1        0
26  43      1        1
27  43      1        1
28  43      1        1
29  44      1        0
30  46      1        0
31  47      1        1
32  48      1        1
33  48      1        1
34  50      1        0
35  50      1        1
36  52      1        1
37  55      1        1
38  61      1        1
39  61      1        1
40  61      1        1

> X1=data[,1]
> X1
 [1] 37 39 39 42 47 48 48 52 53 55 56 57 58 58 60 64 65 68 68 70 34 38 40 40 41 43 43
[28] 43 44 46 47 48 48 50 50 52 55 61 61 61
> X2=as.factor(data[,2])
> X2
 [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Levels: 0 1
> Y=data[,3]
> lm.fit=glm(Y~X1+X2,family=binomial(link="logit"),)
> lm.fit

Call: glm(formula = Y ~ X1 + X2, family = binomial(link = "logit"))

Coefficients:
(Intercept)           X1          X21
    -9.8302       0.1578       3.4849

Degrees of Freedom: 39 Total (i.e. Null); 37 Residual
Null Deviance: 55.45
Residual Deviance: 38.9 AIC: 44.9
> beta=lm.fit$coefficients
> beta
(Intercept)          X1         X21
 -9.8301685   0.1578414   3.4849454

(a)

The fitted model is

log(p/(1-p)) = -9.8301685 + 0.1578414*X1 + 3.4849454*X2 = eta

where X1 is age in years and X2 = 1 if male and 0 if female. Solving for p:

p/(1-p) = exp(eta)

(1-p)/p = 1/exp(eta)

1/p = 1/exp(eta) + 1

1/p = (1 + exp(eta))/exp(eta)

p = exp(eta)/(1 + exp(eta))

that is,

p = exp(-9.8301685 + 0.1578414*X1 + 3.4849454*X2) / (1 + exp(-9.8301685 + 0.1578414*X1 + 3.4849454*X2))

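Hence, for a male (X2 = 1):

at age 42, p = 0.5705549

at age 52, p = 0.865591

So the estimated probability of response for a 42-year-old male is about 0.295 lower than for a 52-year-old male (0.571 versus 0.866). As age increases, the probability of response also increases.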
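The two probabilities can be checked directly from the reported coefficients. This is a minimal sketch, not part of the original answer: the names beta, ages, eta and p are chosen here for illustration, and plogis() is base R's inverse logit, exp(eta)/(1 + exp(eta)).

```r
# Coefficients as reported in the R output (intercept, age, male indicator)
beta <- c(intercept = -9.8301685, age = 0.1578414, male = 3.4849454)

# Linear predictor for a male (X2 = 1) at ages 42 and 52
ages <- c(42, 52)
eta <- beta[["intercept"]] + beta[["age"]] * ages + beta[["male"]] * 1

# Inverse logit turns the linear predictor into a probability
p <- plogis(eta)
names(p) <- paste0("age", ages)
print(round(p, 4))

# Change in probability between the two ages
p_diff <- p[["age52"]] - p[["age42"]]
print(round(p_diff, 4))
```

The difference is on the probability scale, so it applies only to this pair of ages; a 10-year gap elsewhere on the curve gives a different change.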

(b)

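p = exp(-9.8301685 + 0.1578414*X1 + 3.4849454*X2) / (1 + exp(-9.8301685 + 0.1578414*X1 + 3.4849454*X2))

where X2 = 1 if male and 0 if female.

Males have a higher probability of response than females of the same age.

The coefficient of X2 (3.4849) is positive, so at any given age the log-odds for males are 3.4849 higher than for females, the reference level. The odds of response for a male are therefore exp(3.4849), about 32.6 times, the odds for a female of the same age, and a higher odds means a higher probability.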
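To see the gender effect on the probability scale, the fitted formula can be evaluated for both genders over a few ages. This is only a sketch built from the reported coefficients; the age grid and the names comparison and odds_ratio_gender are assumptions made for illustration.

```r
# Coefficients as reported in the R output (intercept, age, male indicator)
beta <- c(intercept = -9.8301685, age = 0.1578414, male = 3.4849454)

# A few illustrative ages (hypothetical grid, not from the data)
ages <- seq(35, 70, by = 5)

# Linear predictors: females are the reference level (X2 = 0)
eta_female <- beta[["intercept"]] + beta[["age"]] * ages
eta_male <- eta_female + beta[["male"]]

comparison <- data.frame(
  age = ages,
  p_female = round(plogis(eta_female), 3),
  p_male = round(plogis(eta_male), 3)
)
print(comparison)

# Odds ratio, male versus female at the same age
odds_ratio_gender <- exp(beta[["male"]])
print(round(odds_ratio_gender, 2))
```

Because the model has no age-by-gender interaction, the odds ratio is the same at every age, while the gap in probabilities varies with age.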

(c)

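The odds of response are

p/(1-p) = exp(-9.8301685 + 0.1578414*X1 + 3.4849454*X2) = exp(eta)

A one-year increase in age, with gender held fixed, adds 0.1578414 to eta and so multiplies the odds by exp(0.1578414), which is about 1.171. The odds of response therefore increase by about 17.1% for each additional year of age, for males and females alike.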
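The multiplicative effect of age on the odds can be computed from the age coefficient alone. The sketch below uses the reported estimate; as a consistency check (an addition here, not part of the original answer) it compares the 10-year odds multiplier with the ratio of the odds implied by the two probabilities found in part (a).

```r
# Age coefficient as reported in the R output
beta_age <- 0.1578414

# Odds multiplier for a one-year and a ten-year increase in age
odds_mult_1yr <- exp(beta_age)
odds_mult_10yr <- exp(10 * beta_age)
print(round(c(one_year = odds_mult_1yr, ten_years = odds_mult_10yr), 3))

# Consistency check against part (a): odds at age 52 over odds at age 42 (males)
odds <- function(p) p / (1 - p)
ratio_check <- odds(0.865591) / odds(0.5705549)
print(round(ratio_check, 3))
```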

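(d) Predict the probability of response with the predict command (type = "response") over a range of ages, separately for females and males, and then plot the two curves against age on the same graph.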
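One possible way to do this is sketched below. It types the 40 observations from the question directly into R instead of reading D:/data.txt, and it keeps gender as female/male labels rather than the 0/1 coding used above. The names harrell, fit, age_grid, p_female and p_male are chosen here for illustration.

```r
# Data from the question: 20 females followed by 20 males
age <- c(37, 39, 39, 42, 47, 48, 48, 52, 53, 55, 56, 57, 58, 58, 60, 64, 65, 68, 68, 70,
         34, 38, 40, 40, 41, 43, 43, 43, 44, 46, 47, 48, 48, 50, 50, 52, 55, 61, 61, 61)
gender <- factor(rep(c("female", "male"), each = 20))
response <- c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1,
              1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1)
harrell <- data.frame(age, gender, response)

# Same model as above, with female as the reference level
fit <- glm(response ~ age + gender, family = binomial(link = "logit"), data = harrell)

# Predicted probabilities on a grid of ages, one set per gender
age_grid <- seq(min(age), max(age), by = 0.5)
new_female <- data.frame(age = age_grid,
                         gender = factor("female", levels = levels(gender)))
new_male <- data.frame(age = age_grid,
                       gender = factor("male", levels = levels(gender)))
p_female <- predict(fit, newdata = new_female, type = "response")
p_male <- predict(fit, newdata = new_male, type = "response")

# One curve per gender on the same axes
plot(age_grid, p_female, type = "l", lty = 1, ylim = c(0, 1),
     xlab = "Age (years)", ylab = "Estimated probability of response")
lines(age_grid, p_male, lty = 2)
legend("topleft", legend = c("female", "male"), lty = c(1, 2))
```

Both curves are S-shaped and increasing in age, with the male curve lying above the female curve throughout, in line with parts (a) and (b).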
