You work for a bank as a business data analyst in the credit card risk-modeling

ID: 2757061 • Letter: Y

Question

You work for a bank as a business data analyst in the credit card risk-modeling department. Your bank recently conducted a bold experiment: over a short time interval three years ago, it quietly issued credit cards to all 600 people who applied, regardless of their credit risk.

After three years, 150, or 25%, of card recipients defaulted – they failed to pay back at least some of the money they owed. However, the bank collected very valuable proprietary data that it can now use to optimize its future card-issuing process.

The bank initially collected six pieces of data about each person.

Age

Years at current employer

Years at current address

Income over the past year

Current credit card debt, and

Current automobile debt

You are first asked to propose a binary classification model for default that uses only data from one or more of the above six inputs, and outputs a single “score.” The relative rank-ordering of scores will determine the model’s effectiveness. For convenience, you are asked to use a scale for your score that has a maximum < 3.5 and a minimum > -3.5.

Initially, you are not told the bank’s best estimate of the cost per False Negative (an accepted applicant who becomes a defaulting customer) or per False Positive (a rejected applicant who would not have defaulted). Therefore, the best you can do is to design a model that maximizes the Area Under the ROC Curve, or AUC.
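For reference, AUC can be read as the probability that a randomly chosen defaulter receives a higher score than a randomly chosen non-defaulter, with ties counted as one half. The following is a minimal sketch under that reading, not the Excel "AUC Calculator" mentioned in the question. The function name `auc`, the toy scores and labels, and the convention that a higher score means higher default risk are all assumptions made for illustration.

```python
def auc(scores, labels):
    """Fraction of (defaulter, non-defaulter) pairs in which the defaulter
    has the higher score; tied pairs count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]  # defaulters
    neg = [s for s, y in zip(scores, labels) if y == 0]  # non-defaulters
    if not pos or not neg:
        raise ValueError("need at least one defaulter and one non-defaulter")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = defaulted, 0 = did not default.
toy_scores = [2.1, 0.3, -1.0, 1.2, -0.4]
toy_labels = [1, 0, 0, 0, 1]
toy_auc = auc(toy_scores, toy_labels)
print(toy_auc)
```

Because only the pairwise ordering matters, any rescaling of the scores by a positive constant leaves the AUC unchanged, while flipping the sign of every score turns an AUC of a into 1 - a.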

You are told that if your model is effective (“high enough” AUC, not defined) and “robust” (not defined, but in general meaning relatively little change in AUC across multiple sets of available data), it may be adopted by the bank as a predictive model for default, to determine which future applicants will be issued credit cards.

First Binary Classification Model: You are first given a “training set” of 200 out of the 600 people in the experiment. Design your model on this set. Standardize your data first. You may combine the six inputs by adding them to or subtracting them from each other, taking simple ratios, etc. The only restriction is that your final “score” needs to be scaled so that the maximum is less than 3.5 and the minimum is greater than -3.5, so you can use the Excel “AUC Calculator” provided.
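One way to read the standardize, combine, and rescale steps is sketched below. It is an illustration only: the two inputs chosen, the toy values, the helper names `standardize` and `rescale`, and the 3.4 bound are assumptions, not the model the question expects.

```python
from statistics import mean, pstdev


def standardize(values):
    """Convert raw values to z-scores (mean 0, population standard deviation 1)."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        raise ValueError("cannot standardize a constant column")
    return [(v - m) / s for v in values]


def rescale(raw, bound=3.4):
    """Multiply by a positive constant so the largest |score| equals `bound`.
    A positive multiplier keeps the rank order, and therefore the AUC, unchanged."""
    peak = max(abs(r) for r in raw)
    return [r * bound / peak for r in raw] if peak > 0 else list(raw)


# Hypothetical toy values for five applicants (thousands of dollars).
income = [30.0, 55.0, 80.0, 42.0, 120.0]
card_debt = [6.0, 2.0, 1.5, 9.0, 3.0]

z_income = standardize(income)
z_debt = standardize(card_debt)

# Assumed direction: more card debt and less income means a higher risk score.
raw = [d - i for d, i in zip(z_debt, z_income)]
score = rescale(raw)
print([round(s, 2) for s in score])
```

Rescaling by a constant, rather than clipping values at the bounds, avoids creating ties among extreme scores.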

Question 1: What is your model? Give it as a function of two or more of the six inputs that outputs a single numerical score between -3.5 and 3.5 for each applicant.

Question 2: What is your model’s AUC on the Training Set?

Explanation / Answer

Question 1: What is your model? Give it as a function of two or more of the six inputs that outputs a single numerical score between -3.5 and 3.5 for each applicant.

Answer:

The logistic regression analysis uses an automatic stepwise procedure. It begins by selecting the strongest candidate predictor, then tests the remaining candidate predictors, one at a time, for inclusion in the model. At each step, we check whether a new candidate predictor improves the model significantly. We also check whether, once the new predictor is included, any predictor already in the model should stay or be removed. If a newly entered predictor does a better job of explaining loan default behavior, a predictor already in the model may be removed because it no longer uniquely explains enough. The procedure continues until all the candidate predictors have been tested for inclusion and removal. When the analysis is finished, the output is a table containing various statistics.
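The selection loop described above can be sketched roughly as follows. This is a simplified, forward-only version: it omits the removal check, fits the logistic regression with plain gradient ascent, and admits a predictor only when the log-likelihood improves by at least 1.92 (half of 3.84, the 5% critical value of a chi-square with one degree of freedom). The function names, the threshold choice, and the synthetic columns `signal` and `noise` are assumptions for illustration, not the procedure of any particular statistics package.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(rows, y, steps=300, lr=0.5):
    """Fit intercept and coefficients by gradient ascent.
    Returns (weights, log-likelihood); weights[0] is the intercept."""
    n, k = len(y), len(rows[0])
    w = [0.0] * (k + 1)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for x, t in zip(rows, y):
            err = t - sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    ll = 0.0
    for x, t in zip(rows, y):
        p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))
        p = min(max(p, 1e-12), 1 - 1e-12)  # guard against log(0)
        ll += t * math.log(p) + (1 - t) * math.log(1 - p)
    return w, ll


def forward_stepwise(columns, y, min_gain=1.92):
    """columns maps a predictor name to its list of (standardized) values.
    Repeatedly adds the predictor with the largest log-likelihood gain."""
    n = len(y)
    selected = []
    _, best_ll = fit_logistic([[] for _ in range(n)], y)  # intercept-only model
    while len(selected) < len(columns):
        trials = []
        for name in columns:
            if name in selected:
                continue
            names = selected + [name]
            rows = [[columns[c][i] for c in names] for i in range(n)]
            trials.append((fit_logistic(rows, y)[1], name))
        ll, name = max(trials)
        if ll - best_ll < min_gain:
            break  # no remaining candidate improves the fit enough
        selected.append(name)
        best_ll = ll
    return selected


# Synthetic data (hypothetical): only `signal` actually drives default.
random.seed(0)
n_rows = 200
signal = [random.gauss(0, 1) for _ in range(n_rows)]
noise = [random.gauss(0, 1) for _ in range(n_rows)]
defaulted = [1 if random.random() < sigmoid(2.0 * s) else 0 for s in signal]

selected = forward_stepwise({"signal": signal, "noise": noise}, defaulted)
print(selected)
```

The fitted linear predictor (intercept plus weighted inputs) could then serve as the score, rescaled by a positive constant if it falls outside the required range.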

Question 2: What is your model’s AUC on the Training Set?

Answer:

One approach is to split the training set into two parts, train-train and train-test, build the model on train-train only, and then compare the AUC on each part:

# GetAUC is a user-defined helper (not shown); Happy is the outcome column.
aucAll = GetAUC(myModel, train, train$Happy)
aucTrainTrain = GetAUC(myModel, trainTrain, trainTrain$Happy)
aucTrainTest = GetAUC(myModel, trainTest, trainTest$Happy)
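The split-and-compare idea can also be sketched in self-contained Python. Everything here is an assumption made for illustration: the synthetic 200-row data, the 140/60 split, and the one-input "model", which is nothing more than a standardized value (so fitting it is trivial). The point is only the mechanics of computing the AUC on the full set, the fitting part, and the held-out part.

```python
import math
import random
from statistics import mean, pstdev


def auc(scores, labels):
    """Share of (defaulter, non-defaulter) pairs ranked correctly; ties count half."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


random.seed(1)
# Hypothetical stand-in for the 200-row training set: (input value, defaulted flag).
rows = []
for _ in range(200):
    x = random.gauss(0, 1)
    p_default = 1 / (1 + math.exp(-(1.5 * x - 1.1)))
    rows.append((x, 1 if random.random() < p_default else 0))

random.shuffle(rows)
fit_part, holdout = rows[:140], rows[140:]

# "Build" the model on the fitting part only: here, just the standardization constants.
m = mean([x for x, _ in fit_part])
s = pstdev([x for x, _ in fit_part])


def score(x):
    return (x - m) / s


def auc_on(part):
    return auc([score(x) for x, _ in part], [t for _, t in part])


auc_all, auc_fit, auc_holdout = auc_on(rows), auc_on(fit_part), auc_on(holdout)
print(f"all={auc_all:.3f} fit={auc_fit:.3f} holdout={auc_holdout:.3f}")
```

A large gap between the fitting-part AUC and the held-out AUC would suggest the model is not robust in the sense the question describes.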
