2) A large number of insurance records are to be examined to develop a model for
Question
2) A large number of insurance records are to be examined to develop a model for predicting fraudulent claims. Of the claims in the historical database, 1% were judged to be fraudulent (class 1). A sample database is taken to develop a model, and oversampling is used to provide a balanced sample in light of the very low response rate. When applied to this sample database (total number of records, N = 800), the model correctly classifies 310 frauds and 270 non-frauds. It misses 90 frauds and incorrectly classifies 130 non-fraud records as frauds. Note: the fraudulent to non-fraudulent (positive to negative) ratio is 1:99. If the number of positive records is fixed at 400, find:
a) the total number of records that should be in the original non-oversampled database;
b) the total number of negative records that should be in the original non-oversampled database;
c) the total number of false negative records that should be in the original non-oversampled database;
d) the total number of true negative records that should be in the original non-oversampled database;
e) the adjusted misclassification rate (error rate) for the original non-oversampled database;
f) the adjusted positive response rate for the original non-oversampled database.
Explanation / Answer
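Sample confusion matrix (N = 800): 400 actual frauds (310 correctly classified, 90 missed) and 400 actual non-frauds (270 correctly classified, 130 incorrectly flagged as frauds).
Because the 400 positive records are kept fixed and frauds make up 1% of the original data, only the negative records are rescaled: each sampled non-fraud stands for 39,600 / 400 = 99 records in the original database.
a) total number of records in the original non-oversampled database = 400 / 0.01 = 40,000
b) total number of negative records in the original non-oversampled database = 40,000 - 400 = 39,600
c) false negatives are missed frauds, which are positive records, so their count is unchanged = 90. (The count that does scale is false positives: 130 * 99 = 12,870.)
d) total number of true negative records that should be in the original non-oversampled database = 270 * 99 = 26,730
e) adjusted misclassification rate (error rate) = ((90 + 12,870) / 40,000) * 100 = 32.4 %
f) adjusted positive response rate, read here as the share of records the model would classify as fraud = ((310 + 12,870) / 40,000) * 100 = 32.95 %. The actual fraud rate in the original database remains 400 / 40,000 = 1 %.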
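The oversampling adjustment can be checked with a short script. This is a minimal sketch, not a definitive method: it assumes the 400 frauds are kept as-is, that frauds are 1 out of every 100 original records (so non-fraud counts are scaled up), and that "positive response rate" means the share of records predicted as fraud. The function name `adjust_for_oversampling` and the dictionary keys are hypothetical, chosen only for illustration.

```python
def adjust_for_oversampling(tp, fn, tn, fp, neg_per_pos):
    """Rescale the negative-class counts of an oversampled confusion matrix.

    Assumption: positives are kept fixed, and the original data holds
    `neg_per_pos` negatives for every positive (99 for a 1% fraud rate).
    """
    positives = tp + fn                      # actual frauds in the sample
    sample_negatives = tn + fp               # actual non-frauds in the sample
    negatives = positives * neg_per_pos      # non-frauds in the original data
    total = positives + negatives
    weight = negatives / sample_negatives    # original records per sampled non-fraud

    adj_tn = tn * weight
    adj_fp = fp * weight
    return {
        "total": total,
        "negatives": negatives,
        "false_negatives": fn,               # positives are not rescaled
        "false_positives": adj_fp,
        "true_negatives": adj_tn,
        "error_rate": (fn + adj_fp) / total,
        "predicted_positive_rate": (tp + adj_fp) / total,
    }


# Counts taken from the question: 310 caught frauds, 90 missed frauds,
# 270 correctly passed non-frauds, 130 non-frauds flagged as fraud.
adjusted = adjust_for_oversampling(tp=310, fn=90, tn=270, fp=130, neg_per_pos=99)
for name, value in adjusted.items():
    print(f"{name}: {value}")
```

Only the negative row of the confusion matrix is weighted, which is why the false negative count stays at its sample value while the false positive and true negative counts grow by the same factor.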