Question
Statistics using R
We will use the data set described below repeatedly throughout the course. I recommend you save your work in an R script file each time you work with this data.
A data set describing the sale of individual residential property in Ames, Iowa from 2006 to 2010 was obtained by Dean De Cock, a statistics professor at Truman State University. The data set contains 2930 observations and a large number of explanatory variables involved in assessing home values. Source: http://www.amstat.org/publications/jse/v19n3/decock.pdf
This semester we will look at a sample of 200 homes from this data set. These homes are all located in the Sawyer neighborhood of the city. Observations include the following eight variables:
• lot_shape: Lot Shape
o Reg = Regular
o IRR = Irregular
• lot_config: Lot configuration
o Inside = Inside lot
o Corner = Corner lot
• Style
o Yes = Home has one story
o No = Home has more than one story
• roof_style: Type of Roof
o Gable = Gable
o Hip = Hip
• garage_area: Size of garage in square feet
• lot_area: Lot size in square feet
• living_area: Total home living area in square feet (including unfinished square footage)
• sale_price: Sale price in dollars
Access the data for this problem using the command
sawyer<-read.csv("http://www.math.usu.edu/cfairbourn/Stat2300/RStudioFiles/data/sawyer.csv")
Instructions: Watch the video demonstrating how to calculate confidence intervals in RStudio. For each question below, include your R code and the output. NOTE: You must have the mosaic package active in your R session for the prop.test command to work as shown in the videos.
3. [1] Calculate a 95% confidence interval for the proportion of homes in Sawyer that have hip style roofs.
4. [1] Calculate a 99% confidence interval for the proportion of homes in Sawyer that are on inside lots.
5. [1] Calculate a 95% confidence interval for the mean garage area of homes in Sawyer.
6. [1] Calculate a 90% confidence interval for the mean sale price of homes in Sawyer.
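As a rough illustration of the prop.test approach mentioned in the instructions, the sketch below calls base R's prop.test on made-up counts. The values x = 40 and n = 200 are hypothetical and are not computed from sawyer.csv, and the syntax used in the course videos (with mosaic loaded) may differ. It also computes the textbook Wald interval by hand, since prop.test reports a score interval and the two will usually not match exactly.

```r
# Hypothetical counts for illustration only; NOT taken from sawyer.csv.
x <- 40    # number of "successes" (e.g., homes with a hip roof)
n <- 200   # sample size

# Score (Wilson) interval from base R's prop.test.
# correct = FALSE turns off the continuity correction.
res <- prop.test(x, n, conf.level = 0.95, correct = FALSE)
score_ci <- as.numeric(res$conf.int)
score_ci

# Wald interval by hand: pbar +/- z * sqrt(pbar * (1 - pbar) / n)
pbar <- x / n
se <- sqrt(pbar * (1 - pbar) / n)
wald_ci <- pbar + c(-1, 1) * qnorm(0.975) * se
wald_ci
```

For a 99% interval, change conf.level to 0.99 and qnorm(0.975) to qnorm(0.995).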
Explanation / Answer
The complete R snippet is as follows
sawyer<-read.csv("http://www.math.usu.edu/cfairbourn/Stat2300/RStudioFiles/data/sawyer.csv")
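# a) Question 3: 95% CI for the proportion of homes with hip roofs
roof = na.omit(sawyer$roof_style)
n = length(roof)
k = sum(roof == "Hip")
pbar = k/n; pbar
SE = sqrt(pbar*(1-pbar)/n); SE # standard error
E = qnorm(.975)*SE; E # margin of error
pbar + c(-E, E) # lower and upper limits
# b) Question 4: 99% CI for the proportion of homes on inside lots
lot = na.omit(sawyer$lot_config)
n = length(lot)
k = sum(lot == "Inside")
pbar = k/n; pbar
SE = sqrt(pbar*(1-pbar)/n); SE # standard error
E = qnorm(1-0.01/2)*SE; E # margin of error
pbar + c(-E, E) # lower and upper limits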
# c) Question 5: 95% CI for mean garage area (normal approximation; upper limit, then lower limit)
mean(sawyer$garage_area) + qnorm(1-0.05/2)*sd(sawyer$garage_area)/sqrt(length(sawyer$garage_area))
mean(sawyer$garage_area) - qnorm(1-0.05/2)*sd(sawyer$garage_area)/sqrt(length(sawyer$garage_area))
# d) Question 6: 90% CI for mean sale price (normal approximation; upper limit, then lower limit)
mean(sawyer$sale_price) + qnorm(1-0.1/2)*sd(sawyer$sale_price)/sqrt(length(sawyer$sale_price))
mean(sawyer$sale_price) - qnorm(1-0.1/2)*sd(sawyer$sale_price)/sqrt(length(sawyer$sale_price))
The results are
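> pbar + c(-E, E)
[1] 0.7045497 0.8554503
> pbar + c(-E, E)
[1] 0.7045497 0.8554503
Note: the two proportion intervals printed above are identical. The values are consistent with part b) (a sample proportion of 0.78 with n = 200 gives 0.78 ± 2.5758 × 0.02929 at the 99% level), so the output for part a) appears to have been overwritten by the output for b). Rerun the a) block to obtain the hip-roof interval.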
> mean(sawyer$garage_area) + qnorm(1-0.05/2)*sd(sawyer$garage_area)/sqrt(length(sawyer$garage_area))
[1] 461.9328
> mean(sawyer$garage_area) - qnorm(1-0.05/2)*sd(sawyer$garage_area)/sqrt(length(sawyer$garage_area))
[1] 414.4972
> mean(sawyer$sale_price) + qnorm(1-0.1/2)*sd(sawyer$sale_price)/sqrt(length(sawyer$sale_price))
[1] 164883.1
> mean(sawyer$sale_price) - qnorm(1-0.1/2)*sd(sawyer$sale_price)/sqrt(length(sawyer$sale_price))
[1] 154374.1
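The intervals for the means above use qnorm, which is a normal approximation. Because the standard deviation is estimated from the sample, the usual textbook interval uses the t distribution instead. The sketch below compares the two on simulated data; the vector area is made up for illustration (it is not the Sawyer data), and with about 200 observations the two intervals should differ only slightly.

```r
# Simulated garage areas for illustration only; NOT the Sawyer data.
set.seed(1)
area <- rnorm(200, mean = 440, sd = 170)

n <- length(area)
xbar <- mean(area)
se <- sd(area) / sqrt(n)   # estimated standard error of the mean

# z-based interval, as in the answer above
z_ci <- xbar + c(-1, 1) * qnorm(0.975) * se

# t-based interval by hand, with n - 1 degrees of freedom
t_ci <- xbar + c(-1, 1) * qt(0.975, df = n - 1) * se

# The same t-based interval from base R's t.test
tt_ci <- as.numeric(t.test(area, conf.level = 0.95)$conf.int)

z_ci
t_ci
tt_ci
```

With the real data loaded, the corresponding calls would be t.test(sawyer$garage_area, conf.level = 0.95) and t.test(sawyer$sale_price, conf.level = 0.90), reading the interval from the conf.int component of the result.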