Do in R - Find predicted value using analysis of this data set
Question
Do in R - Find predicted value using analysis of this data set
Please do this in R.
Consumer Research is an independent agency that conducts research on consumer attitudes and behaviors for a variety of firms. In this study, a client asked for an investigation of the consumer characteristics that can be used to predict the amount charged by credit card users. Data from 2002 were collected on annual income (in $1000s), household size, and annual credit card charges for a sample of consumers.
Predict the annual credit card charge for a 3-person household with an annual income of $50,000. Also find the confidence interval and the prediction interval, if you can.
Consumer data set
data Consumer ;
input income housesize amtcharged;
datalines;
55.00 3.00 4116.00
31.00 2.00 3159.00
32.00 4.00 5100.00
51.00 5.00 4742.00
31.00 2.00 1864.00
55.00 2.00 4070.00
37.00 1.00 2731.00
40.00 2.00 3348.00
66.00 4.00 4764.00
51.00 3.00 4110.00
25.00 3.00 4208.00
48.00 4.00 4219.00
27.00 1.00 2477.00
33.00 2.00 2514.00
65.00 3.00 4214.00
63.00 4.00 4965.00
42.00 6.00 4412.00
21.00 2.00 2448.00
44.00 1.00 2995.00
37.00 5.00 4171.00
62.00 6.00 5678.00
21.00 3.00 3623.00
55.00 7.00 5301.00
42.00 2.00 3020.00
41.00 7.00 4828.00
54.00 6.00 5573.00
30.00 1.00 2583.00
48.00 2.00 3866.00
34.00 5.00 3586.00
67.00 4.00 5037.00
50.00 2.00 3605.00
67.00 5.00 5345.00
55.00 6.00 5370.00
52.00 2.00 3890.00
62.00 3.00 4705.00
64.00 2.00 4157.00
22.00 3.00 3579.00
29.00 4.00 3890.00
39.00 2.00 2972.00
35.00 1.00 3121.00
39.00 4.00 4183.00
54.00 3.00 3730.00
23.00 6.00 4127.00
27.00 2.00 2921.00
26.00 7.00 4603.00
61.00 2.00 4273.00
30.00 2.00 3067.00
22.00 4.00 3074.00
46.00 5.00 4820.00
66.00 4.00 5149.00
60.00 4.00 5002.00
32.00 3.00 3100.00
;
proc print;
run;
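The data step above is SAS, while the question asks for R. As a minimal sketch, the same datalines can be read into an R data frame with read.table(). The column names follow the SAS input statement; the data frame name consumer and the file name in the commented-out write.csv() line are assumptions chosen for illustration.

```r
# Read the SAS datalines into an R data frame.
# Columns: income (in $1000s), housesize, amtcharged (annual charges).
consumer <- read.table(text = "
55.00 3.00 4116.00
31.00 2.00 3159.00
32.00 4.00 5100.00
51.00 5.00 4742.00
31.00 2.00 1864.00
55.00 2.00 4070.00
37.00 1.00 2731.00
40.00 2.00 3348.00
66.00 4.00 4764.00
51.00 3.00 4110.00
25.00 3.00 4208.00
48.00 4.00 4219.00
27.00 1.00 2477.00
33.00 2.00 2514.00
65.00 3.00 4214.00
63.00 4.00 4965.00
42.00 6.00 4412.00
21.00 2.00 2448.00
44.00 1.00 2995.00
37.00 5.00 4171.00
62.00 6.00 5678.00
21.00 3.00 3623.00
55.00 7.00 5301.00
42.00 2.00 3020.00
41.00 7.00 4828.00
54.00 6.00 5573.00
30.00 1.00 2583.00
48.00 2.00 3866.00
34.00 5.00 3586.00
67.00 4.00 5037.00
50.00 2.00 3605.00
67.00 5.00 5345.00
55.00 6.00 5370.00
52.00 2.00 3890.00
62.00 3.00 4705.00
64.00 2.00 4157.00
22.00 3.00 3579.00
29.00 4.00 3890.00
39.00 2.00 2972.00
35.00 1.00 3121.00
39.00 4.00 4183.00
54.00 3.00 3730.00
23.00 6.00 4127.00
27.00 2.00 2921.00
26.00 7.00 4603.00
61.00 2.00 4273.00
30.00 2.00 3067.00
22.00 4.00 3074.00
46.00 5.00 4820.00
66.00 4.00 5149.00
60.00 4.00 5002.00
32.00 3.00 3100.00
", col.names = c("income", "housesize", "amtcharged"))

str(consumer)  # quick check of the column types and row count

# Optional: save as a CSV for read.csv(); the file name is an assumption.
# write.csv(consumer, "chegg.csv", row.names = FALSE)
```

Income stays in thousands of dollars here, so a $50,000 income corresponds to income = 50 unless the column is rescaled first.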
Explanation / Answer
# Multiple Linear Regression
# Importing the dataset (assumes the data above were saved as 'chegg.csv'
# with columns income, housesize, amtcharged)
dataset = read.csv('chegg.csv')
# Income is recorded in thousands; convert it to dollars
dataset$income = dataset$income*1000
summary(dataset)
# No feature scaling: lm() does not need it, and scaling amtcharged
# would return predictions in standardized units rather than dollars
# install.packages('caTools')
library(caTools)
set.seed(123)
# Hold out 20% of the rows to check the fit
split = sample.split(dataset$amtcharged, SplitRatio = 0.8)
training_set = subset(dataset, split == TRUE)
test_set = subset(dataset, split == FALSE)
# Fitting Multiple Linear Regression to the Training set
regressor = lm(formula = amtcharged ~ .,
               data = training_set)
summary(regressor)
# Predicting the Test set results
y_pred = predict(regressor, newdata = test_set[-3])
plot(y_pred, test_set$amtcharged)
#Predict the annual credit card charge for a 3-person household
#with an annual income of $50,000. Also find confidence interval and prediction interval,
#if you can.
newdata = data.frame(income = 50000, housesize = 3)
# Point prediction with a 95% confidence interval for the mean charge
predVal = predict(regressor, newdata = newdata, interval = 'confidence', level = 0.95)
# Point prediction with a 95% prediction interval for a single household
predInt = predict(regressor, newdata = newdata, interval = 'prediction', level = 0.95)
predVal
predInt
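The question asks for both a confidence interval and a prediction interval. The sketch below shows how the two differ by computing them with predict() and then again by hand from the usual linear-model formulas. To stay self-contained it uses synthetic data, not the consumer data set: the data frame toy, its coefficients, and the noise level are made up for illustration.

```r
# Synthetic data (NOT the consumer data set); all numbers here are arbitrary.
set.seed(1)
n <- 40
income <- runif(n, 20, 70)                     # in $1000s
housesize <- sample(1:7, n, replace = TRUE)
amtcharged <- 1300 + 33 * income + 356 * housesize + rnorm(n, sd = 400)
toy <- data.frame(income, housesize, amtcharged)

fit <- lm(amtcharged ~ income + housesize, data = toy)
new_obs <- data.frame(income = 50, housesize = 3)

# Built-in intervals: each call returns a matrix with columns fit, lwr, upr.
ci_builtin <- predict(fit, newdata = new_obs, interval = "confidence", level = 0.95)
pi_builtin <- predict(fit, newdata = new_obs, interval = "prediction", level = 0.95)

# The same intervals by hand.
X <- model.matrix(fit)                           # design matrix with intercept column
x0 <- c(1, 50, 3)                                # intercept, income, housesize
s2 <- sum(residuals(fit)^2) / df.residual(fit)   # residual variance estimate
h0 <- drop(t(x0) %*% solve(crossprod(X)) %*% x0) # x0' (X'X)^-1 x0
y_hat <- sum(x0 * coef(fit))                     # point prediction
t_crit <- qt(0.975, df.residual(fit))            # two-sided 95% critical value

# Confidence interval: uncertainty in the mean response at x0.
ci_manual <- y_hat + c(-1, 1) * t_crit * sqrt(s2 * h0)
# Prediction interval: adds the variance of a single new observation.
pi_manual <- y_hat + c(-1, 1) * t_crit * sqrt(s2 * (1 + h0))

ci_builtin
pi_builtin
```

The prediction interval is always the wider of the two, because it covers one new household's charge rather than the average charge of all households with that income and size.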