


Question

> library(rpart)

> dt=rpart(Species~.,iris)

> plot(dt);text(dt);

> table(predict(dt,type="class"),iris$Species)

Try these statements in R. Explain their meanings. Explain the result of the contingency table.

> dt=rpart(Species~.,iris,control=rpart.control(cp=0.0,minsplit=0))

> plot(dt);text(dt);

> table(predict(dt,type="class"),iris$Species)

Explain the meaning of the first statement. Explain the result of the contingency table and its difference with the previous result.

> train_index = c(sample(50,30), sample(50,30)+50, sample(50,30)+100)

> iris_train=iris[train_index,]

> iris_test=iris[-train_index,]

> dt=rpart(Species~.,iris_train)

> plot(dt);text(dt);

> table(predict(dt,newdata=iris_test,type="class"),iris_test$Species)

Explain the meaning of these statements.

> library(randomForest)

> rf=randomForest(Species~., iris, ntree=1000, proximity=TRUE)

> table(predict(rf,type="class"),iris$Species)

Explain the meaning of these statements.

Explanation / Answer

# I am providing the explanations as comments below each line.
library(rpart)
# The line above loads the rpart package, which provides recursive partitioning (decision trees).
dt=rpart(Species~.,iris)
# rpart() fits a decision tree. Species is a factor, so this is a classification tree;
# with a numeric response it would fit a regression tree.
# The formula Species~. means "predict Species from all the other columns".
# iris is a data set built into R with 150 rows and 5 columns:
# Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, Species
# The fitted decision tree is stored in dt.
plot(dt);text(dt);
# plot(dt) draws the branches of the decision tree; text(dt) adds the split rules and leaf labels to that plot.
table(predict(dt,type="class"),iris$Species)
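# predict(dt,type="class") returns the predicted species for every row of the training data (no newdata is given).
# table() cross-tabulates the predictions (rows) against the actual species (columns).
# This is the contingency table: diagonal cells count correct predictions, off-diagonal cells count errors.
# e.g. if the actual value is setosa and the prediction is setosa, 1 is added to cell [setosa, setosa], a correct prediction.
# If the actual value is setosa and the prediction is versicolor, 1 is added to cell [versicolor, setosa], an incorrect prediction.
# With the default settings setosa is separated perfectly; the few errors are confusions between versicolor and virginica.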
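
The reading of the contingency table described above can be checked numerically. The sketch below labels the two axes of the table and computes the share of correct predictions from its diagonal. The names `pred`, `cm` and `accuracy` are illustrative and not part of the original question.

```r
library(rpart)

# Fit the default classification tree on the built-in iris data.
dt <- rpart(Species ~ ., data = iris)

# Named arguments label the axes: rows are predictions, columns are true species.
pred <- predict(dt, type = "class")
cm <- table(Predicted = pred, Actual = iris$Species)
print(cm)

# Correct predictions sit on the diagonal, so accuracy = diagonal sum / total.
accuracy <- sum(diag(cm)) / sum(cm)
print(accuracy)
```

Note that this is accuracy on the same rows used to fit the tree, so it tends to be optimistic.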


dt=rpart(Species~.,iris,control=rpart.control(cp=0.0,minsplit=0))
# Here rpart() fits a tree on the same data, but with non-default settings passed through rpart.control().
# minsplit is the minimum number of observations that must exist in a node for a split to be attempted; 0 removes that limit.
# cp is the complexity parameter. Any split that does not decrease the overall lack of fit by a factor of cp is not attempted;
# cp=0.0 means no split is rejected for improving the fit too little.
# Together these settings let the tree grow as far as it can instead of stopping early.

plot(dt);text(dt);
# As before, plot(dt) draws the tree and text(dt) labels it; expect a larger tree than with the default settings.
table(predict(dt,type="class"),iris$Species)
# As before, this cross-tabulates the predicted species (rows) against the actual species (columns) on the training data.
# Difference from the previous result: the tree now keeps splitting while any split improves the fit,
# so the off-diagonal (error) counts should shrink, typically to zero.
# That reflects the tree memorizing the training data (overfitting), not necessarily better predictions on new flowers.
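
To make the difference between the two fits concrete, the sketch below compares the number of leaves and the training accuracy of the default tree and the tree grown with `cp=0.0, minsplit=0`. The helper names `n_leaves` and `train_acc` are hypothetical and introduced only for this illustration.

```r
library(rpart)

dt_default <- rpart(Species ~ ., data = iris)
dt_full <- rpart(Species ~ ., data = iris,
                 control = rpart.control(cp = 0, minsplit = 0))

# Hypothetical helper: terminal nodes are the rows of tree$frame marked "<leaf>".
n_leaves <- function(tree) sum(tree$frame$var == "<leaf>")

# Hypothetical helper: share of training rows whose predicted class is correct.
train_acc <- function(tree) {
  mean(predict(tree, type = "class") == iris$Species)
}

print(c(default = n_leaves(dt_default), full = n_leaves(dt_full)))
print(c(default = train_acc(dt_default), full = train_acc(dt_full)))
```

A larger leaf count together with a higher training accuracy is the usual signature of an unpruned tree; it says nothing yet about accuracy on unseen data.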


train_index = c(sample(50,30), sample(50,30)+50, sample(50,30)+100)
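# sample(50,30) draws 30 distinct random integers from 1 to 50 (sampling without replacement).
# sample(50,30)+50 shifts a second draw into the range 51 to 100, and sample(50,30)+100 shifts a third draw into 101 to 150.
# iris holds 50 setosa rows, then 50 versicolor, then 50 virginica, so train_index picks 30 random rows of each species (90 in total).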
iris_train=iris[train_index,]
# The rows of iris whose row numbers are in train_index are stored in iris_train (the training set, 90 rows).
iris_test=iris[-train_index,]
# The negative index drops those rows, so the remaining 60 rows (20 per species) are stored in iris_test (the test set).
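
The split above is random, so it changes on every run. A minimal sketch of a repeatable version follows, assuming a fixed seed (the value 42 is arbitrary and not from the original question), together with a check that each species contributes 30 training rows and 20 test rows.

```r
set.seed(42)  # assumed seed; any fixed value makes the split repeatable

# Rows 1-50, 51-100 and 101-150 of iris are the three species in order,
# so each sample() call picks 30 rows from one species.
train_index <- c(sample(50, 30), sample(50, 30) + 50, sample(50, 30) + 100)
iris_train <- iris[train_index, ]
iris_test  <- iris[-train_index, ]

# Per-species row counts in each part of the split.
print(table(iris_train$Species))
print(table(iris_test$Species))
```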
dt=rpart(Species~.,iris_train)
# rpart() fits a classification tree using only the training rows in iris_train.
plot(dt);text(dt);
# plot(dt) draws this tree and text(dt) labels its splits and leaves.
table(predict(dt,newdata=iris_test,type="class"),iris_test$Species)
# predict() with newdata=iris_test predicts the species of the held-out test rows, which the tree did not see during fitting.
# table() cross-tabulates those predictions (rows) against the actual test species (columns).
# Diagonal cells are correct predictions and off-diagonal cells are errors, e.g. an actual setosa predicted as versicolor lands in cell [versicolor, setosa].
# This table estimates how well the tree generalizes to new data; the counts vary from run to run because the split is random.
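
The point of the train/test split is to compare fit on seen and unseen rows. The sketch below computes both accuracies for one split; the seed and the names `train_accuracy` and `test_accuracy` are assumptions made for the illustration.

```r
library(rpart)

set.seed(1)  # assumed seed for a repeatable split
train_index <- c(sample(50, 30), sample(50, 30) + 50, sample(50, 30) + 100)
iris_train <- iris[train_index, ]
iris_test  <- iris[-train_index, ]

dt <- rpart(Species ~ ., data = iris_train)

# Accuracy on the rows used for fitting.
train_accuracy <- mean(predict(dt, type = "class") == iris_train$Species)

# Accuracy on rows the tree never saw during fitting.
test_pred <- predict(dt, newdata = iris_test, type = "class")
test_accuracy <- mean(test_pred == iris_test$Species)

print(c(train = train_accuracy, test = test_accuracy))
```

The test figure is the more honest of the two, although with only 60 test rows it can move noticeably between different random splits.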

library(randomForest)
# The line above loads the randomForest package.
rf=randomForest(Species~., iris, ntree=1000, proximity=TRUE)
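# randomForest() fits a random forest: many decision trees, each grown on a bootstrap sample of the input data,
# with a random subset of the predictors considered at each split. The forest predicts by majority vote of its trees.
# ntree=1000 sets the number of trees to 1000.
# proximity=TRUE also computes a proximity matrix that measures how often pairs of rows end up in the same leaf.
# The fitted forest is stored in rf.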
table(predict(rf,type="class"),iris$Species)

# predict(rf) without newdata returns the out-of-bag predictions: each row is predicted only by trees whose bootstrap sample did not include it.
# (type="class" is accepted for backward compatibility and treated as type="response".)
# table() cross-tabulates these predictions (rows) against the actual species (columns).
# Diagonal cells are correct predictions and off-diagonal cells are errors, e.g. an actual setosa predicted as versicolor lands in cell [versicolor, setosa].
# Because the predictions are out-of-bag, this table is an honest error estimate rather than a training-fit table.
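
The fitted forest also stores the pieces discussed above. The sketch below prints the out-of-bag table, the confusion matrix kept on the fit, and the size of the proximity matrix. The seed is an assumption, and the `requireNamespace()` guard is added only so the snippet skips quietly where the randomForest package is not installed.

```r
# Guard (added for this sketch): run only if randomForest is installed.
if (requireNamespace("randomForest", quietly = TRUE)) {
  library(randomForest)
  set.seed(7)  # assumed seed; forests are random, so results vary without it

  rf <- randomForest(Species ~ ., data = iris, ntree = 1000, proximity = TRUE)

  # Without newdata, predict() returns the out-of-bag predictions.
  oob_pred <- predict(rf)
  print(table(Predicted = oob_pred, Actual = iris$Species))

  # The fit keeps its own out-of-bag confusion matrix with per-class error.
  print(rf$confusion)

  # proximity = TRUE stores one proximity value per pair of rows.
  print(dim(rf$proximity))
}
```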
