


Question

(Please answer using RStudio.)

# Download the data file hmda_small.csv to a folder and read the data into R
# using the read.csv() function. Be sure to assign the data to an object
# called hmda when you read it in.

Here is the code:

data_url <- "https://raw.githubusercontent.com/wampeh1/Ecog314_Spring2017/master/lecture4/data/hmda_small.csv"

# Download the file only if it is not already in the working directory.
if (!file.exists("hmda_small.csv")) {
  download.file(data_url, "hmda_small.csv")
}

# The question asks for the object to be called hmda.
hmda <- read.csv("hmda_small.csv", stringsAsFactors = FALSE)

# Part 6: a. How many loans in the data set are for each of the following:
# "Home improvement", "Refinancing", "Home purchase"?
# b. How many "Application denied by financial institution" observations were there?
# What fraction of total observations is this?

# Part 7: Using ifelse(), create a new variable called race_ethnicity that is equal to
# applicant_ethnicity_name when applicant_ethnicity_name is "Hispanic or Latino"
# and applicant_race_name_1 otherwise. Calculate summary statistics of the loan variable
# for each racial and ethnic group. Which group has the largest average loan? Which group
# has the largest standard deviation in loan amount?

Explanation / Answer

The general logic for solving the question above is as follows:

After downloading the CSV file, read it into R and go through the relevant column one row at a time.

Part 6 : A : Algorithm

Step 1: Identify the relevant column, in our case loan_purpose_name, whose values are the loan purposes "Home improvement", "Refinancing" and "Home purchase". (Code version: use a for loop that visits one row of this column at each iteration.)

Step 2: For each row, check which of the three loan purposes it holds. (Code version: use if (a == "Home improvement"), and likewise for the other two. The text must match the value in the data exactly.)

Step 3: Each time a row matches, add one to the count for that loan purpose. (Code version: use a counter variable.)

Step 4: Apply the same check to all three loan purposes. (Code version: use if, else if and else.)
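
The counting steps above can be sketched in R as follows. This is only a minimal sketch: it assumes the purposes are stored in a column named loan_purpose_name (the name mentioned in Step 1), and the helper name count_purposes and the three-row toy data frame are made up for illustration. Base R's table() produces the same counts without an explicit loop, so it is shown alongside.

```r
# Hypothetical helper: count rows per loan purpose with an explicit loop,
# mirroring the steps above (visit each row, compare, increment a counter).
count_purposes <- function(df,
                           purposes = c("Home improvement",
                                        "Refinancing",
                                        "Home purchase")) {
  # One counter per purpose, all starting at 0.
  counts <- setNames(integer(length(purposes)), purposes)
  for (value in df$loan_purpose_name) {
    if (value %in% purposes) {
      counts[value] <- counts[value] + 1L
    }
  }
  counts
}

# Made-up toy rows standing in for the real data (not taken from hmda_small.csv).
toy <- data.frame(
  loan_purpose_name = c("Home purchase", "Refinancing", "Home purchase"),
  stringsAsFactors = FALSE
)

loop_counts  <- count_purposes(toy)          # loop version
table_counts <- table(toy$loan_purpose_name) # idiomatic equivalent

print(loop_counts)
print(table_counts)
```

On the real data, the same calls would be made on the data frame read from hmda_small.csv (called hmda in the question) instead of toy. Note that table() only lists values that actually occur, while the loop version reports a zero for a purpose with no rows.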

B.1 : Algorithm

Step 1: Identify the column that records the outcome of each application and visit its rows one at a time using a for loop.

Step 2: Compare the value in each row with "Application denied by financial institution" using an if statement.

Step 3: Count how many times it appears using a counter variable.

Step 4: Store the count in a variable such as A.
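
A short R sketch of this count is given below. The column name action_taken_name is an assumption (the question does not say which column holds the outcome, so check names() on the real data first), and the helper count_denied and the toy rows are hypothetical. In R, comparing the whole column at once and summing the TRUE values replaces the explicit loop and counter.

```r
# Hypothetical helper: number of rows whose outcome equals the denial label.
# ASSUMPTION: the outcome is stored in a column called "action_taken_name".
count_denied <- function(df,
                         column = "action_taken_name",
                         label = "Application denied by financial institution") {
  # df[[column]] == label gives TRUE/FALSE per row; sum() counts the TRUEs.
  sum(df[[column]] == label, na.rm = TRUE)
}

# Made-up toy rows standing in for the real data.
toy <- data.frame(
  action_taken_name = c("Loan originated",
                        "Application denied by financial institution",
                        "Loan originated",
                        "Application denied by financial institution"),
  stringsAsFactors = FALSE
)

A <- count_denied(toy)  # the count stored in A
print(A)
```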

B.2 : Algorithm

Step 1: Iterate through all the rows of the column and add one to a counter variable at every iteration to get the total number of observations.

Step 2: Store the result in a variable such as B.

Step 3: Compute A / B, i.e. divide the denied count A by the total B, to get the fraction of observations that were denied.
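
The fraction can be sketched in R as the denied count A divided by the total number of observations B, that is A / B. As before, this is a sketch under assumptions: the column name action_taken_name is a guess to be checked against the real data, and the helper denied_fraction and the toy rows are hypothetical. nrow() returns the total directly, so no loop is needed for B.

```r
# Hypothetical helper: share of rows that carry the denial label.
# ASSUMPTION: the outcome is stored in a column called "action_taken_name".
denied_fraction <- function(df,
                            column = "action_taken_name",
                            label = "Application denied by financial institution") {
  A <- sum(df[[column]] == label, na.rm = TRUE)  # denied applications
  B <- nrow(df)                                  # total observations
  A / B
}

# Made-up toy rows standing in for the real data.
toy <- data.frame(
  action_taken_name = c("Loan originated",
                        "Application denied by financial institution",
                        "Loan originated",
                        "Loan originated"),
  stringsAsFactors = FALSE
)

frac <- denied_fraction(toy)
print(frac)
```

An equivalent one-liner is mean(df[[column]] == label), since the mean of a TRUE/FALSE vector is the proportion of TRUE values, provided the column has no missing values.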