Question
As you can see, I have the problem solved below, but I just need answers for Steps 9-11. Can someone please help? Just Steps 9, 10, and 11, please.
Background
Sugarcane cultivation requires a tropical or subtropical climate, with a minimum of 24 inches of annual moisture. Sugarcane is one of the most efficient photosynthesizers in the plant kingdom, able to convert up to 2 percent of incident solar energy into biomass [1]. The sugarcane development cycle has four growth stages: Germination (1 month), Tillering Phase (3 months), Grand Growth Period (5 months), and Ripening Phase (last 3 months).
Photosynthesis and respiration are two of the major plant functions. Photosynthesis is the process by which plants transform the radiant energy of the sun into chemical energy stored in carbohydrates (sugars). It requires sunlight and water.
Respiration is the process in which the stored chemical energy is released to the plant for its various functions such as growth.
The growth diagram (Exhibit 1) shows which months contribute to the actual growth of the sugarcane stem and leaves. Growth requires energy, so the plant respires, burning sugar to obtain that energy. The Ripening Phase is therefore the major determinant of the sugar content of the sugarcane: during this stage simple sugars are converted to cane sugar, and little energy is burned off in respiration because little growth occurs [2]. The months of April, May and June correspond to the ripening stage.
Motivation for the Study
The sugar yield determines the efficiency of the crop, so it can help predict farmers' revenue and costs. Furthermore, in light of recent bio-energy developments, sugarcane has emerged as the most efficient plant for the production of ethanol, the alternative to gasoline. Since sugar is the main ingredient in the production of ethanol, knowing the sugar yield from sugarcane would essentially allow forecasting the amount of ethanol produced from a sugarcane farm.
Since sugar creation depends on photosynthesis, the sugar yield should be related to precipitation (water) and temperature (sunlight). The goal of the study is to quantify these effects and to compare yields between two states, Florida and Louisiana.
Data
The data file contains the following information:
State – the state where the sugarcane is grown
Sugar_yield_acres – sugar yield in tons per acre for a given state and year (the average of yields across all sugarcane farms)
Crop_Year – the year when the crop was collected
Average_Temp – average temperature in degrees Fahrenheit for the three months (April, May and June)
Average_Precip – average precipitation in inches for the three months (April, May and June)
[1] http://en.wikipedia.org/wiki/Sugarcane
[2] http://www.sugarcanecrops.com/
Exhibit 1 - Growth of Sugarcane
Analysis Steps
Step 1. Create histogram of Sugar Yield for Florida. What are your observations?
Step 2. Compute the average sugar yield per acre for Florida. Compute sample standard deviation of sugar yield per acre for Florida.
Step 3. Your assistant tells you that based on the data collected for Louisiana, the sample average sugar yield is 3.10 tons/acres, sample standard deviation is 1.25 tons/acres. Your assistant tells you that the information is based on the sample size of 10. Test the hypothesis that the average sugar yield per acre in Florida is the same as the average sugar yield per acre in Louisiana. Use alpha 5%.
Step 4. Try to explain the results in step 3 and offer a possible business/life explanation
Step 5. Create scatter plots for sugar yield per acre vs. average temperature, vs. average precipitation and vs. crop year for Florida. Find the correlation for each case and describe the trends. Do trends make sense, why or why not?
Step 6. Run the regression line for sugar yield vs. average temperature for Florida. What is its R2 and adjusted R2?
Step 7. Run a regression of sugar yield vs. average temperature and crop year for Florida. Compare R2 and adjusted R2 to the ones in Step 6. Does the additional variable, crop year, improve the model? How do you think it can be interpreted (Hint: Do you think agricultural practices change with time or remain the same?)
Step 8. Run a regression of sugar yield vs. average temperature, crop year and average precipitation for Florida. Compare R2 and adjusted R2 to the ones in Step 7. Does the additional variable, average precipitation, improve the model relative to the results you saw in Step 7? Is coefficient for average precipitation statistically significant at alpha 10%? Can you think of a reason why that is the case (think how farming may have evolved; do farmers depend on the rain for water)?
Step 9. State which of the three models (from Steps 6, 7, or 8) you intend to use to predict sugar yield for Florida, and why.
Step 10. Using the model you have chosen in Step 9, predict the sugar yield per acre for Florida for year 27. You are told that the average temperature for year 27 for Florida was 76, and average precipitation was 3.29. Compute the error of your prediction assuming that the actual yield for Florida in year 27 was 4.48. Express the error as a percent of the actual value.
Step 11. What are some of the warnings to the users of your models? Can we conclude causality? Why or why not? Can we confidently predict results for the year 100? Why or why not?
Expert Answer
Step 1
From the histogram, we observe that the distribution is positively skewed.
Step 2
Variable                   Mean   StDev
Sugar_yield_acres (tons/   4.203  0.572
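As a cross-check on the Step 2 figures, here is a minimal sketch that recomputes the mean and sample standard deviation using only Python's standard library. The yields are copied from the Florida data listing later in this post; the variable names are illustrative and not part of the original Minitab session.

```python
import statistics

# Florida sugar yields (tons/acre) for crop years 1-26,
# copied from the data listing in this post.
florida_yield = [
    3.47, 3.98, 3.71, 3.39, 3.80, 3.69, 3.78, 3.77, 3.88, 3.30,
    4.31, 5.05, 3.89, 4.17, 4.00, 4.25, 4.03, 4.30, 5.52, 4.44,
    4.72, 4.45, 4.82, 5.14, 4.40, 5.01,
]

mean_yield = statistics.mean(florida_yield)  # arithmetic mean
sd_yield = statistics.stdev(florida_yield)   # sample SD (divides by n - 1)

print(f"Mean  = {mean_yield:.3f}")
print(f"StDev = {sd_yield:.3f}")
```

Rounded to three decimals, both values agree with the Minitab output above.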
Step 5
From the scatter plots we observe that sugar yield per acre is positively correlated with average temperature and with crop year. Sugar yield per acre and average precipitation, however, are essentially uncorrelated.
Pearson correlation of Sugar yield acres (tons/acre) and Crop Year = 0.756
Pearson correlation of Sugar yield acres (tons/acre) and Average Temp (F) = 0.655
Pearson correlation of Sugar yield acres (tons/acre) and Average Precip (inch) = 0.082
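The three correlations can be reproduced with a short sketch like the one below. It assumes the Florida rows from the data listing in this post and uses a hand-written `pearson` helper (a hypothetical name, not a Minitab function) so that it runs without third-party packages.

```python
import math

# Florida data copied from the listing in this post (crop years 1-26).
sugar_yield = [3.47, 3.98, 3.71, 3.39, 3.80, 3.69, 3.78, 3.77, 3.88, 3.30,
               4.31, 5.05, 3.89, 4.17, 4.00, 4.25, 4.03, 4.30, 5.52, 4.44,
               4.72, 4.45, 4.82, 5.14, 4.40, 5.01]
crop_year = list(range(1, 27))
avg_temp = [74.80, 76.90, 75.07, 73.13, 73.93, 75.50, 74.47, 73.77, 73.97,
            75.07, 75.60, 78.50, 73.57, 73.53, 76.20, 76.00, 74.63, 74.00,
            76.93, 75.43, 75.27, 74.93, 76.60, 76.00, 74.97, 77.75]
avg_precip = [4.78, 2.85, 6.87, 5.27, 4.77, 4.12, 3.49, 3.43, 3.01, 5.00,
              3.68, 6.94, 5.62, 3.19, 6.00, 5.48, 4.83, 5.83, 3.75, 5.04,
              3.05, 3.98, 5.03, 6.11, 4.11, 5.12]


def pearson(x, y):
    """Pearson correlation: Sxy / sqrt(Sxx * Syy)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r_year = pearson(crop_year, sugar_yield)
r_temp = pearson(avg_temp, sugar_yield)
r_precip = pearson(avg_precip, sugar_yield)

print(f"yield vs crop year:     {r_year:.3f}")
print(f"yield vs average temp:  {r_temp:.3f}")
print(f"yield vs average precip: {r_precip:.3f}")
```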
Step 6
Regression Analysis: Sugar_yield_acre versus Average_Temp (F)
The regression equation is
Sugar_yield_acres (tons/acre) = - 16.8 + 0.279 Average_Temp (F)
Predictor            Coef  SE Coef      T      P
Constant          -16.757    4.933  -3.40  0.002
Average_Temp (F)  0.27853  0.06555   4.25  0.000
S = 0.441056   R-Sq = 42.9%   R-Sq(adj) = 40.6%
Analysis of Variance
Source          DF      SS      MS      F      P
Regression       1  3.5124  3.5124  18.06  0.000
Residual Error  24  4.6687  0.1945
Total           25  8.1811
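For readers without Minitab, the Step 6 fit can be approximated with the closed-form least-squares formulas. This is a sketch under the assumption that the Florida rows listed in this post are the same data Minitab used; the helper name `fit_simple_ols` is made up for illustration.

```python
# Florida data copied from the listing in this post (crop years 1-26).
avg_temp = [74.80, 76.90, 75.07, 73.13, 73.93, 75.50, 74.47, 73.77, 73.97,
            75.07, 75.60, 78.50, 73.57, 73.53, 76.20, 76.00, 74.63, 74.00,
            76.93, 75.43, 75.27, 74.93, 76.60, 76.00, 74.97, 77.75]
sugar_yield = [3.47, 3.98, 3.71, 3.39, 3.80, 3.69, 3.78, 3.77, 3.88, 3.30,
               4.31, 5.05, 3.89, 4.17, 4.00, 4.25, 4.03, 4.30, 5.52, 4.44,
               4.72, 4.45, 4.82, 5.14, 4.40, 5.01]


def fit_simple_ols(x, y):
    """One-predictor least squares: slope = Sxy / Sxx, intercept from the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares give R^2 and adjusted R^2.
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)
    r2 = 1 - sse / sst
    adj_r2 = 1 - (sse / (n - 2)) / (sst / (n - 1))
    return slope, intercept, r2, adj_r2


slope, intercept, r2, adj_r2 = fit_simple_ols(avg_temp, sugar_yield)
print(f"yield = {intercept:.3f} + {slope:.5f} * Average_Temp")
print(f"R-Sq = {r2:.1%}, R-Sq(adj) = {adj_r2:.1%}")
```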
Step 7
Regression Analysis: Sugar_yield_acre versus Average_Temp (F), Crop_Year
The regression equation is
Sugar_yield_acres (tons/acre) = - 11.0 + 0.194 Average_Temp (F) + 0.0454 Crop_Year
Predictor             Coef   SE Coef      T      P
Constant           -11.037     3.434  -3.21  0.004
Average_Temp (F)   0.19438   0.04608   4.22  0.000
Crop_Year         0.045396  0.008107   5.60  0.000
S = 0.293072   R-Sq = 75.9%   R-Sq(adj) = 73.8%
Analysis of Variance
Source          DF      SS      MS      F      P
Regression       2  6.2056  3.1028  36.12  0.000
Residual Error  23  1.9755  0.0859
Total           25  8.1811
Yes, the additional variable, crop year, improves the model: R-Sq(adj) rises to 73.8%, from 40.6% for the Step 6 model.
Step 8
Regression Analysis: Sugar_yield_ versus Average_Temp, Crop_Year, ...
The regression equation is
Sugar_yield_acres (tons/acre) = - 11.5 + 0.203 Average_Temp (F) + 0.0454 Crop_Year - 0.0429 Average_Precip (inch)
Predictor                  Coef   SE Coef      T      P
Constant                -11.521     3.507  -3.29  0.003
Average_Temp (F)        0.20346   0.04768   4.27  0.000
Crop_Year              0.045447  0.008163   5.57  0.000
Average_Precip (inch)  -0.04288   0.05186  -0.83  0.417
S = 0.295108   R-Sq = 76.6%   R-Sq(adj) = 73.4%
Analysis of Variance
Source          DF      SS      MS      F      P
Regression       3  6.2652  2.0884  23.98  0.000
Residual Error  22  1.9160  0.0871
Total           25  8.1811
For this model, R-Sq(adj) = 73.4%, whereas for the model in Step 7, R-Sq(adj) = 73.8%.
The additional variable, average precipitation, does not improve the model relative to Step 7: R-Sq rises slightly (76.6% vs. 75.9%), but R-Sq(adj) falls. The coefficient for average precipitation is not statistically significant at alpha 10%, since p-value = 0.417 > 0.1.
STATE    Sugar_yield_acres (tons/acre)  Crop_Year  Average_Temp (F)  Average_Precip (inch)
Florida  3.47                           1          74.80             4.78
Florida  3.98                           2          76.90             2.85
Florida  3.71                           3          75.07             6.87
Florida  3.39                           4          73.13             5.27
Florida  3.80                           5          73.93             4.77
Florida  3.69                           6          75.50             4.12
Florida  3.78                           7          74.47             3.49
Florida  3.77                           8          73.77             3.43
Florida  3.88                           9          73.97             3.01
Florida  3.30                           10         75.07             5.00
Florida  4.31                           11         75.60             3.68
Florida  5.05                           12         78.50             6.94
Florida  3.89                           13         73.57             5.62
Florida  4.17                           14         73.53             3.19
Florida  4.00                           15         76.20             6.00
Florida  4.25                           16         76.00             5.48
Florida  4.03                           17         74.63             4.83
Florida  4.30                           18         74.00             5.83
Florida  5.52                           19         76.93             3.75
Florida  4.44                           20         75.43             5.04
Florida  4.72                           21         75.27             3.05
Florida  4.45                           22         74.93             3.98
Florida  4.82                           23         76.60             5.03
Florida  5.14                           24         76.00             6.11
Florida  4.40                           25         74.97             4.11
Florida  5.01                           26         77.75             5.12
Explanation / Answer
Step 9
Of the three models, we will use the one from Step 7 because it has the highest R-Sq(adj) value (73.8%, versus 40.6% for Step 6 and 73.4% for Step 8). We compare R-Sq(adj) rather than R-Sq because R-Sq never decreases when a predictor is added, so a model can appear to fit better simply because it has more terms, which can be misleading. R-Sq(adj) penalizes each additional predictor: it increases only when the new variable improves the fit by more than would be expected by chance, and it decreases when the variable adds little.
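The comparison can be checked directly from the ANOVA tables in Steps 6 to 8. The sketch below is illustrative only: it takes the Residual Error and Total sums of squares printed by Minitab, with n = 26 observations, and applies the usual adjusted R-squared formula; the function and variable names are my own.

```python
def adjusted_r2(sse, sst, n, p):
    """Adjusted R^2 = 1 - (SSE / (n - p - 1)) / (SST / (n - 1)), p = number of predictors."""
    return 1 - (sse / (n - p - 1)) / (sst / (n - 1))


N = 26        # Florida observations (crop years 1-26)
SST = 8.1811  # Total SS, identical in all three ANOVA tables

# (Residual Error SS, number of predictors), read from the Minitab output above.
models = {
    "Step 6": (4.6687, 1),
    "Step 7": (1.9755, 2),
    "Step 8": (1.9160, 3),
}

results = {}
for name, (sse, p) in models.items():
    r2 = 1 - sse / SST
    results[name] = (r2, adjusted_r2(sse, SST, N, p))
    print(f"{name}: R-Sq = {r2:.1%}, R-Sq(adj) = {results[name][1]:.1%}")

# Pick the model with the highest adjusted R^2.
best = max(results, key=lambda name: results[name][1])
print("Highest R-Sq(adj):", best)
```

Note that Step 8 has the highest plain R-Sq but a lower R-Sq(adj) than Step 7, which is exactly the situation the adjusted measure is meant to flag.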
Step 10
Regression equation for model derived in Step 7 is as follows:
Sugar_yield_acres (tons/acre) = - 11.0 + 0.194 Average_Temp (F) + 0.0454 Crop_Year
Now, it is given that Average_Temp = 76 and Crop_Year = 27. So Sugar_yield_acres (tons/acre) can be obtained by substituting these values in the regression equation:
Sugar_yield_acres (tons/acre) = - 11.0 + 0.194*76 + 0.0454*27
Sugar_yield_acres (tons/acre) = 4.97
The actual yield is 4.48
% Error = |Actual - Predicted| * 100 / Actual
% Error = |4.48 - 4.97| * 100 / 4.48
% Error = 10.94%
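The arithmetic can be reproduced with the short sketch below. It evaluates the Step 7 equation twice: once with the rounded coefficients used in this answer, and once with the full-precision coefficients from the Minitab predictor table, to show how much rounding matters. The function name `percent_error` is just an illustrative helper.

```python
def percent_error(actual, predicted):
    """Absolute error expressed as a percent of the actual value."""
    return abs(actual - predicted) / actual * 100


temp, year, actual = 76, 27, 4.48

# Rounded coefficients, as in the regression equation quoted in this answer.
pred_rounded = -11.0 + 0.194 * temp + 0.0454 * year
# Full-precision coefficients from the Step 7 predictor table.
pred_full = -11.037 + 0.19438 * temp + 0.045396 * year

err_rounded = percent_error(actual, pred_rounded)
err_full = percent_error(actual, pred_full)

print(f"rounded coefficients: {pred_rounded:.4f} tons/acre, error {err_rounded:.2f}%")
print(f"full coefficients:    {pred_full:.4f} tons/acre, error {err_full:.2f}%")
```

The 10.94% above comes from rounding the prediction to 4.97 before computing the error; without that intermediate rounding the figure is about 10.93%, and about 10.75% with the full-precision coefficients. Either way the model overpredicts year 27 by roughly 11%.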
Step 11
Regression describes how the response depends statistically on the predictors in the model, but a statistical relationship does not by itself imply causation. For example, we can state that rainfall affects crop yield, and there is data to support this; that relationship is one-way, since crop yield cannot affect rainfall. The regression alone, however, cannot establish such a direction or rule out other factors. Crop_Year, for instance, is unlikely to raise yields by itself and more plausibly stands in for changes in agricultural practices over time. In short, we cannot conclude causality from these models.
We cannot confidently predict results for year 100 because that crop year lies far outside the range of the data (years 1 to 26) from which the regression equation was derived.
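To make the extrapolation warning concrete, here is a small sketch that plugs year 100 into the Step 7 equation and flags inputs outside the observed range. The temperature of 76 is an arbitrary assumption chosen only for illustration, and the helper names are hypothetical.

```python
# Range of Crop_Year observed in the Florida data (years 1-26).
YEAR_MIN, YEAR_MAX = 1, 26


def predict_yield(temp, year):
    """Step 7 regression equation with the rounded coefficients from this answer."""
    return -11.0 + 0.194 * temp + 0.0454 * year


def is_extrapolating(year):
    """True when the requested year lies outside the fitted data range."""
    return not (YEAR_MIN <= year <= YEAR_MAX)


year = 100
pred_100 = predict_yield(76, year)  # temp = 76 is an assumed value
print(f"Predicted yield for year {year}: {pred_100:.3f} tons/acre")
if is_extrapolating(year):
    print("Warning: year is outside the fitted range 1-26; the linear trend may not hold.")
```

The equation still returns a number (over 8 tons/acre, far above any yield in the data), but nothing in 26 years of observations tells us whether the linear trend in Crop_Year continues for another 74 years.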