A bank is interested in identifying different attributes of its customers and be
Question
A bank is interested in identifying different attributes of its customers; below is sample data for 150 customers. In the data table, for the dummy variable Gender, 0 represents Male and 1 represents Female. For the dummy variable Personal loan, 0 represents a customer who has not taken a personal loan and 1 represents a customer who has taken a personal loan. Partition the data into training (50 percent), validation (30 percent), and test (20 percent) sets. Use logistic regression to classify observations as Personal loan taken (or not taken) using Age, Gender, Work experience, Income (in 1000 $), and Family size as input variables and Personal loan as the output variable. Perform an exhaustive-search best subset selection with the number of best subsets equal to 2.
a. From the generated set of logistic regression models, select one that you believe is a good fit. Express the model as a mathematical equation relating the output variable to the input variables.
b. Increases in which variables increase the chance of a customer who has taken the personal loan? Increases in which variables decrease the chance of a customer who has not taken the personal loan?
c. Using the default cutoff value of 0.5 for your logistic regression model, what is the overall error rate on the test data?
Age
Gender
Work experience
Income (in 1000 $)
Family size
Personal loan
47
0
22
53
3
1
26
1
3
22
1
1
38
0
16
29
4
1
37
0
12
32
6
1
44
0
22
32
3
0
55
1
30
45
7
0
44
1
23
50
2
0
30
1
5
22
2
0
63
0
35
56
2
0
34
1
8
23
4
0
52
0
26
29
1
1
55
0
25
34
2
1
52
0
28
45
3
1
63
1
29
23
3
1
51
1
30
32
4
0
41
0
18
21
5
1
37
0
14
43
4
1
46
0
23
23
3
1
30
1
6
18
3
1
48
1
25
34
2
0
50
1
22
21
1
1
56
0
31
24
4
0
35
1
9
23
3
1
39
0
13
29
5
1
48
0
22
34
6
0
51
1
21
39
2
1
27
1
3
26
1
1
57
1
32
49
2
1
33
1
12
39
3
1
58
0
33
32
2
0
46
0
21
45
3
1
32
0
6
23
5
0
56
1
28
45
3
1
35
1
12
28
4
1
47
1
23
38
1
1
50
0
23
32
3
0
57
1
25
32
4
1
38
0
15
25
5
1
52
1
24
22
2
1
56
0
31
19
3
1
47
1
24
34
4
0
54
0
31
45
2
1
25
0
2
21
1
1
40
1
16
34
6
0
61
1
30
49
2
1
29
1
6
34
1
1
52
0
25
39
3
1
56
0
31
54
2
1
61
0
33
43
2
0
26
0
4
23
2
1
60
1
30
56
3
0
37
1
12
23
5
0
39
1
14
39
4
0
46
1
21
34
5
0
59
1
30
39
2
0
54
0
31
28
1
1
27
0
4
22
1
1
54
0
30
45
2
0
42
1
18
36
4
0
64
0
35
46
2
0
33
1
8
32
6
1
65
0
34
36
1
0
38
0
13
32
4
0
48
1
23
26
5
0
31
1
7
32
3
0
39
0
15
35
5
1
58
1
28
45
2
0
51
0
26
23
3
0
41
0
18
34
3
0
36
1
12
32
2
0
39
0
12
22
4
0
58
0
29
34
2
0
42
1
15
45
1
0
44
1
21
32
3
0
65
1
35
54
2
0
40
0
13
23
5
0
33
1
11
41
4
0
45
1
24
29
5
0
38
0
12
24
4
1
32
1
10
28
3
1
51
0
30
43
2
1
32
0
11
34
1
1
46
0
22
39
4
1
44
1
21
25
3
0
41
1
15
28
2
1
54
1
29
54
4
1
26
1
3
24
2
0
33
0
8
26
3
1
45
1
20
34
4
0
63
1
30
54
2
0
55
0
31
49
3
1
49
1
25
34
5
1
64
1
35
54
2
0
26
0
3
19
1
0
42
1
19
23
1
1
48
1
22
43
3
1
64
1
34
45
2
0
52
0
28
32
4
0
41
0
16
34
3
1
40
0
14
34
5
0
40
0
16
26
4
0
55
1
32
37
2
1
49
0
27
39
3
0
46
0
21
29
1
0
59
1
32
56
2
0
51
0
26
23
3
1
45
0
23
43
4
1
39
1
17
26
4
1
55
1
32
37
5
1
63
0
35
43
2
0
53
1
27
24
1
1
39
0
16
43
6
1
50
1
25
56
3
0
27
0
4
32
1
1
29
0
6
23
2
1
61
0
30
45
1
0
57
1
27
34
2
1
56
1
34
54
4
0
35
1
6
45
5
0
25
0
3
32
2
1
57
0
27
36
2
0
47
0
25
54
4
1
60
1
35
33
3
0
32
0
8
45
2
1
27
1
4
32
1
0
39
0
18
39
4
1
26
0
1
22
1
0
46
1
25
34
5
0
28
0
4
23
2
0
64
0
37
43
4
0
65
1
33
49
2
0
47
1
25
37
4
0
27
1
5
28
2
1
25
0
3
34
1
1
25
0
2
24
1
1
64
0
30
53
2
0
44
1
21
48
5
1
65
0
25
47
2
0
54
0
31
55
1
1
51
0
26
43
3
0
51
0
21
46
1
1
28
1
4
25
2
1
56
1
32
55
3
1
57
1
26
49
2
1
35
1
9
27
6
1
47
1
21
54
4
1
54
1
27
45
2
0
28
1
5
29
3
1
45
0
22
45
5
0
43
1
18
43
2
1
Explanation / Answer
Here, Personal loan is the dependent variable, and Age, Gender, Work experience, Income, and Family size are the independent variables.
Because the dependent variable takes only two values (0 or 1), we fit a binary logistic regression.
Assumptions of binary logistic regression:
1) Your dependent variable should consist of two categorical, independent (unrelated) groups (i.e., a dichotomous variable).
2) You have one or more independent variables that are continuous or nominal (including dichotomous variables).
3) You should have independence of observations, which means that there is no relationship between the observations. If you do not have independence of observations, you most likely have repeated measures, and you will need another type of statistical test.
4) There should be no multicollinearity. Multicollinearity occurs when you have two or more independent variables that are highly correlated with each other. This leads to problems with understanding which variable contributes to the explanation of the dependent variable and technical issues in calculating a binomial logistic regression. Determining whether there is multicollinearity is an important step in binomial logistic regression.
5) There needs to be a linear relationship between any continuous independent variables and the logit transformation of the dependent variable.
6) There should be no outliers, high leverage values or highly influential points. These are observations that do not fit the model well in one of several possible ways (e.g., they exert undue influence on the regression model, skewing it unduly towards themselves).
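Assumption 4 can be checked directly from the data. As a minimal sketch, the Python snippet below computes the Pearson correlation between Age and Work experience using only the first ten customers in the data table. The helper name pearson_r is illustrative, and a proper check would use all 150 rows.

```python
import math

# First ten rows of the data table: Age and Work experience columns.
age = [47, 26, 38, 37, 44, 55, 44, 30, 63, 34]
work_exp = [22, 3, 16, 12, 22, 30, 23, 5, 35, 8]

def pearson_r(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(age, work_exp)
print(round(r, 2))
```

On these ten rows the correlation is close to 1, which suggests that Age and Work experience may be strongly related and that this assumption deserves a check on the full data set.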
We proceed on the basis that this example satisfies the assumptions above, so we use binary logistic regression.
This can be done using MINITAB.
Steps:
Enter all the data in a MINITAB worksheet --> Stat --> Regression --> Binary Logistic Regression --> Response: Personal loan --> Model: enter all the variables except Personal loan --> Results: select the third option --> OK --> Storage: check Delta chi-square and Delta deviance --> OK
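The question asks for a 50/30/20 partition before fitting, whereas the steps above enter all 150 rows into a single model. As a rough sketch of the partitioning step, the Python snippet below shuffles the row indices and splits them into training, validation, and test sets. The function name partition_indices and the seed value are assumptions made for illustration; they are not part of the MINITAB workflow.

```python
import random

N_CUSTOMERS = 150  # sample size stated in the question

def partition_indices(n, train_frac=0.5, valid_frac=0.3, seed=0):
    """Shuffle row indices 0..n-1 and split them into train/validation/test."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable
    n_train = round(n * train_frac)
    n_valid = round(n * valid_frac)
    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    test = indices[n_train + n_valid:]  # whatever remains (20 percent here)
    return train, valid, test

train_idx, valid_idx, test_idx = partition_indices(N_CUSTOMERS)
print(len(train_idx), len(valid_idx), len(test_idx))
```

With 150 customers this yields 75 training, 45 validation, and 30 test rows. The model would then be fitted on the training rows only, and the error rate in part (c) would be computed on the test rows.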
The output is:
Binary Logistic Regression: Personal loan versus Age, Gender, ...
Link Function: Logit
Response Information
Variable       Value  Count
Personal loan  1         78  (Event)
               0         72
               Total    150
Logistic Regression Table
                                                          Odds     95% CI
Predictor                Coef    SE Coef      Z      P   Ratio  Lower  Upper
Constant              4.21463    1.65126   2.55  0.011
Age                 -0.115197  0.0622003  -1.85  0.064    0.89   0.79   1.01
Gender              -0.230171   0.343749  -0.67  0.503    0.79   0.40   1.56
Work experience     0.0770283  0.0709593   1.09  0.278    1.08   0.94   1.24
Income (in 1000 $)  0.0067251  0.0199759   0.34  0.736    1.01   0.97   1.05
Family size         -0.192966   0.123934  -1.56  0.119    0.82   0.65   1.05
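Reading off the Coef column, the full model fitted to all 150 observations (not a best-subset model) can be written as

logit(p) = ln(p / (1 - p)) = 4.21463 - 0.115197 Age - 0.230171 Gender + 0.0770283 Work experience + 0.0067251 Income - 0.192966 Family size

where p is the estimated probability that Personal loan = 1. Positive coefficients (Work experience, Income) raise the estimated log-odds of a loan and negative coefficients (Age, Gender, Family size) lower them, although none of the individual P values in the table is below 0.05. The Python sketch below evaluates this equation and applies the 0.5 cutoff from part (c). The names COEF, predict_probability, and classify are illustrative only.

```python
import math

# Coefficients copied from the Coef column of the MINITAB table.
INTERCEPT = 4.21463
COEF = {
    "age": -0.115197,
    "gender": -0.230171,
    "work_experience": 0.0770283,
    "income": 0.0067251,
    "family_size": -0.192966,
}

def predict_probability(customer):
    """Estimated P(Personal loan = 1) from the fitted logit equation."""
    logit = INTERCEPT + sum(COEF[name] * customer[name] for name in COEF)
    return 1.0 / (1.0 + math.exp(-logit))

def classify(customer, cutoff=0.5):
    """Return 1 (loan taken) when the estimated probability reaches the cutoff."""
    return 1 if predict_probability(customer) >= cutoff else 0

# The odds ratio for each predictor is exp(coefficient).
odds_ratios = {name: round(math.exp(b), 2) for name, b in COEF.items()}

# First customer in the data table: Age 47, Male (0), 22 years of
# experience, income 53 (in 1000 $), family size 3.
first = {"age": 47, "gender": 0, "work_experience": 22,
         "income": 53, "family_size": 3}
print(odds_ratios)
print(round(predict_probability(first), 3), classify(first))
```

The computed odds ratios reproduce the Odds Ratio column (0.89, 0.79, 1.08, 1.01, 0.82). The overall error rate on a test set would be the share of test rows for which classify disagrees with the recorded Personal loan value.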
Log-Likelihood = -97.353
Test that all slopes are zero: G = 12.999, DF = 5, P-Value = 0.023
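The G statistic is a likelihood-ratio statistic: twice the difference between the log-likelihood of the fitted model and that of an intercept-only model. As a rough check, the sketch below rebuilds the intercept-only log-likelihood from the response counts (78 events, 72 non-events) and compares the result with the reported G. The variable names are chosen here for illustration.

```python
import math

events, non_events = 78, 72   # counts from Response Information
n = events + non_events
loglik_full = -97.353         # Log-Likelihood reported by MINITAB

# Intercept-only model: every customer gets the overall event rate.
p0 = events / n
loglik_null = events * math.log(p0) + non_events * math.log(1 - p0)

# Likelihood-ratio statistic with DF = number of slopes (5).
g_statistic = 2 * (loglik_full - loglik_null)
print(round(loglik_null, 3), round(g_statistic, 3))
```

The result agrees with the reported G = 12.999 up to rounding of the printed log-likelihood. With P = 0.023, the five predictors jointly improve on the intercept-only model at the 5 percent level, even though no single slope is individually significant.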
Goodness-of-Fit Tests
Method           Chi-Square   DF      P
Pearson             148.672  143  0.356
Deviance            191.933  143  0.004
Hosmer-Lemeshow       3.719    8  0.882
Table of Observed and Expected Frequencies:
(See Hosmer-Lemeshow Test for the Pearson Chi-Square Statistic)
                                Group
Value     1     2     3     4     5     6     7     8     9    10  Total
1
  Obs     5     4     6     6     7     9    11     9     9    12     78
  Exp   4.2   5.6   6.3   6.9   7.3   7.9   8.5   9.2  10.2  11.8
0
  Obs    10    11     9     9     8     6     4     6     6     3     72
  Exp  10.8   9.4   8.7   8.1   7.7   7.1   6.5   5.8   4.8   3.2
Total    15    15    15    15    15    15    15    15    15    15    150
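The Hosmer-Lemeshow statistic is a Pearson chi-square computed over this table: the sum of (observed - expected)^2 / expected across both response values and all ten groups. The sketch below recomputes it from the printed counts. Because the expected counts are shown to one decimal place only, the result should be close to, but not exactly, the reported 3.719. The helper name pearson_sum is illustrative.

```python
# Observed and expected counts per group, copied from the table above.
obs_event = [5, 4, 6, 6, 7, 9, 11, 9, 9, 12]
exp_event = [4.2, 5.6, 6.3, 6.9, 7.3, 7.9, 8.5, 9.2, 10.2, 11.8]
obs_non_event = [10, 11, 9, 9, 8, 6, 4, 6, 6, 3]
exp_non_event = [10.8, 9.4, 8.7, 8.1, 7.7, 7.1, 6.5, 5.8, 4.8, 3.2]

def pearson_sum(observed, expected):
    """Sum of (O - E)^2 / E over paired observed and expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

hl_statistic = (pearson_sum(obs_event, exp_event)
                + pearson_sum(obs_non_event, exp_non_event))
print(round(hl_statistic, 2))
```

A small statistic relative to its 8 degrees of freedom (P = 0.882 in the output) means the observed and expected counts agree well across the groups.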
Measures of Association:
(Between the Response Variable and Predicted Probabilities)
Pairs       Number  Percent  Summary Measures
Concordant    3775     67.2  Somers' D              0.35
Discordant    1804     32.1  Goodman-Kruskal Gamma  0.35
Ties            37      0.7  Kendall's Tau-a        0.18
Total         5616    100.0
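The three summary measures follow from the pair counts. A pair here is one customer with Personal loan = 1 and one with Personal loan = 0, so there are 78 x 72 = 5616 pairs. The sketch below recomputes the measures using the standard definitions; the variable names are chosen for illustration.

```python
concordant, discordant, ties = 3775, 1804, 37
n_events, n_non_events = 78, 72
n = n_events + n_non_events

# Every (event, non-event) pair is concordant, discordant, or tied.
total_pairs = n_events * n_non_events

# Somers' D: net concordance over all event/non-event pairs.
somers_d = (concordant - discordant) / total_pairs
# Goodman-Kruskal Gamma: same numerator, but tied pairs are excluded.
gamma = (concordant - discordant) / (concordant + discordant)
# Kendall's Tau-a: same numerator over all n(n-1)/2 pairs of observations.
tau_a = (concordant - discordant) / (n * (n - 1) / 2)

print(total_pairs, round(somers_d, 2), round(gamma, 2), round(tau_a, 2))
```

These reproduce the reported values of 0.35, 0.35, and 0.18. With 67.2 percent of pairs concordant, the model ranks a loan-taker above a non-taker about two times out of three, which indicates only modest predictive ability.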