
Question

R problems

opiod <- read.csv("https://data.ct.gov/api/views/rybz-nyjw/rows.csv?accessType=DOWNLOAD",stringsAsFactors = FALSE)

1. Use the opiod data frame to create a data frame that contains two columns. Column 1 should contain the county’s name and column 2 the total number of deaths in that county. You should work with the `Death.County` variable.

2. Add the names “subregion” and “count” to the data frame from step 1.

3. Remove the “NOT RECORDED” row from your data frame.

4. Pass your data frame into the function `ct.choropleth` given below. Explain ways to improve the plot to make it more informative.

ct.choropleth <- function(df){

  # generate county and state boundaries
  ct.state <- map_data("state", region = "connecticut")
  ct.county.df <- map_data("county", region = "connecticut")

  # convert county names to lower case
  county.df <- mutate_all(df, funs(tolower))

  # merge data frames to pass a single data frame to ggplot
  choropleth <- inner_join(ct.county.df, county.df, by = "subregion")

  # convert counts to type numeric
  choropleth$count <- as.numeric(choropleth$count)

  # generate choropleth
  ct.plot <- ggplot(choropleth, aes(long, lat, group = group)) +
    geom_polygon(aes(fill = count), alpha = 0.75, color = "white") +
    geom_polygon(data = ct.county.df, colour = "white", fill = NA) +
    geom_polygon(data = ct.state, color = "black", fill = NA) +
    scale_fill_gradient2(low = "yellow", mid = "orange", high = "red") +
    ggtitle("Opiod deaths in Connecticut by county") +
    labs(fill = "Deaths") +
    theme_void()

  return(ct.plot)
}

Explanation / Answer

Let me know if you have any doubt.


opiod <- read.csv("https://data.ct.gov/api/views/rybz-nyjw/rows.csv?accessType=DOWNLOAD",stringsAsFactors = FALSE)


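# Recode values that are not usable county names as "NOT RECORDED".
# Note: the first pattern starts with a space, so it only matches
# entries that contain " FAIRFIELD" with a leading space.
opiod$Death.County <- gsub(" FAIRFIELD", "NOT RECORDED", opiod$Death.County)

# Blank or missing entries (indexing the column directly also works when NAs are present)
opiod$Death.County[is.na(opiod$Death.County) | opiod$Death.County == ""] <- "NOT RECORDED"

# Entries recorded as "USA"
opiod$Death.County <- gsub("USA", "NOT RECORDED", opiod$Death.County)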

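# 1. Count deaths per county
library(dplyr)
df <- opiod %>%
  group_by(Death.County) %>%
  summarise(n = n())

# 2. Rename the columns
names(df) <- c("subregion", "count")

# 3. Drop the "NOT RECORDED" row
df <- df[df$subregion != "NOT RECORDED", ]

# 4. Plot the counts (map_data() also needs the maps package to be installed)
library(ggplot2)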

ct.choropleth <- function(df){

  # generate county and state boundaries
  ct.state <- map_data("state", region = "connecticut")
  ct.county.df <- map_data("county", region = "connecticut")

  # convert county names to lower case
  # (funs() is deprecated in recent dplyr versions, so tolower is passed directly)
  county.df <- mutate_all(df, tolower)

  # merge data frames to pass a single data frame to ggplot
  choropleth <- inner_join(ct.county.df, county.df, by = "subregion")

  # convert counts back to type numeric (mutate_all turned them into character)
  choropleth$count <- as.numeric(choropleth$count)

  # generate choropleth
  ct.plot <- ggplot(choropleth, aes(long, lat, group = group)) +
    geom_polygon(aes(fill = count), alpha = 0.75, color = "white") +
    geom_polygon(data = ct.county.df, colour = "white", fill = NA) +
    geom_polygon(data = ct.state, color = "black", fill = NA) +
    scale_fill_gradient2(low = "yellow", mid = "orange", high = "red") +
    ggtitle("Opioid deaths in Connecticut by county") +
    labs(fill = "Deaths") +
    theme_void()

  return(ct.plot)
}

ct.choropleth(df)
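
One thing worth checking before plotting: `ct.choropleth` merges with `inner_join`, so a county name in `df` that has no match in the map data is dropped without a warning. A minimal sketch of such a check follows. `unmatched_counties` is a hypothetical helper, the toy counts are made up, and the eight lowercase names are assumed to match what `map_data("county", region = "connecticut")` returns.

```r
# Hypothetical helper (not part of the original answer): return the county
# names in `df` that have no matching subregion in the map data.
unmatched_counties <- function(df, map_subregions) {
  setdiff(tolower(df$subregion), unique(map_subregions))
}

# Toy data for illustration only; the counts are made up.
toy <- data.frame(subregion = c("HARTFORD", "NEW HAVEN", "USA"),
                  count = c(10, 20, 1),
                  stringsAsFactors = FALSE)

# Connecticut's eight counties in lower case (assumed spelling of the map data).
ct_names <- c("fairfield", "hartford", "litchfield", "middlesex",
              "new haven", "new london", "tolland", "windham")

unmatched_counties(toy, ct_names)  # "usa"
```

With the real data, the second argument would be `map_data("county", region = "connecticut")$subregion`. An empty result means every row of `df` will appear on the map.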

Ways to improve the graph:

> The plot does not show county names, so a reader cannot tell which county is which. Labeling each county on the map would fix this.

> Printing the actual death count on each county would be more informative than the color gradient alone.
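
Both suggestions can be sketched as one text label per county that shows the name and the count. The helper below is hypothetical (it is not part of the assignment) and places each label at the centre of the county outline's bounding box, which is only a rough stand-in for a true centroid. The toy outlines and counts are made up so that the block runs without the map data.

```r
# Hypothetical helper: one label position per county, taken as the centre
# of the bounding box of that county's outline points.
county_label_positions <- function(map_df) {
  centre <- function(x) mean(range(x))
  aggregate(cbind(long, lat) ~ subregion, data = map_df, FUN = centre)
}

# Toy outlines for two made-up "counties" (illustration only).
toy_map <- data.frame(
  subregion = rep(c("a", "b"), each = 4),
  long = c(0, 2, 2, 0, 2, 6, 6, 2),
  lat  = c(0, 0, 2, 2, 0, 0, 4, 4),
  stringsAsFactors = FALSE
)
toy_counts <- data.frame(subregion = c("a", "b"), count = c(5, 9),
                         stringsAsFactors = FALSE)

# Attach the counts and build the label text (name on one line, count below).
labels <- merge(county_label_positions(toy_map), toy_counts, by = "subregion")
labels$text <- paste0(labels$subregion, "\n", labels$count)
labels

# With the real data (needs ggplot2 and the maps package), the same idea is:
# map_df <- map_data("county", region = "connecticut")
# labels <- merge(county_label_positions(map_df),
#                 transform(df, subregion = tolower(subregion)),
#                 by = "subregion")
# ct.choropleth(df) +
#   geom_text(data = labels,
#             aes(long, lat, label = paste0(subregion, "\n", count)),
#             inherit.aes = FALSE, size = 3) +
#   scale_fill_gradient(low = "yellow", high = "red")
```

`inherit.aes = FALSE` is used because the label data has no `group` column. The last line is optional: `scale_fill_gradient2` uses a default midpoint of 0, so with strictly positive counts the "low" colour is never reached, and a plain two-colour gradient uses the full range instead.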