Question

Answers to problem from l. to n.

(Solutions a. to d. are available here: https://www.chegg.com/homework-help/questions-and-answers/disk-file-cancer-contains-values-breast-cancer-mortality-1950--simulate-sampling-distribut-q23715851)

Data:

Col1 Col2
1 445
0 559
3 677
4 681
3 746
4 869
1 950
5 976
5 1096
5 1098
5 1114
7 1125
5 1236
6 1285
3 1291
3 1318
2 1323
8 1327
9 1438
7 1479
4 1536
6 1598
6 1635
11 1667
4 1696
7 1792
7 1795
4 1808
6 1838
16 1838
3 1847
8 1933
8 1959
4 1990
9 2003
10 2070
7 2091
8 2099
5 2104
11 2147
4 2154
12 2163
11 2172
9 2174
13 2183
17 2193
11 2210
10 2212
4 2236
4 2245
8 2261
6 2317
8 2333
16 2393
10 2404
4 2419
11 2462
10 2476
11 2477
9 2483
11 2511
14 2591
6 2624
8 2690
12 2731
15 2735
9 2736
13 2747
18 2782
15 2783
12 2793
11 2891
12 2894
12 2906
14 2929
12 2935
3 2962
5 3054
7 3112
9 3118
11 3185
14 3217
18 3236
11 3290
11 3314
4 3316
13 3401
10 3409
10 3426
9 3470
11 3488
12 3511
4 3549
16 3571
20 3578
5 3620
15 3654
15 3680
12 3683
7 3688
7 3706
12 3733
21 3800
16 3802
13 3832
16 3863
8 3891
12 4008
20 4093
21 4149
15 4162
13 4223
13 4232
10 4312
22 4329
14 4331
16 4399
13 4470
24 4618
27 4669
16 4681
28 4737
11 4784
12 4829
14 4857
26 4918
27 4967
17 5041
20 5051
12 5077
20 5107
12 5108
24 5124
27 5156
25 5167
19 5211
21 5246
8 5743
15 5773
22 5932
21 5983
37 5989
23 5998
18 6021
25 6035
26 6074
17 6134
27 6175
13 6220
13 6296
15 6445
33 6624
24 6841
23 6868
18 6903
24 6904
21 6916
32 6934
23 6978
32 7014
16 7025
29 7031
33 7115
20 7256
19 7288
27 7304
10 7367
34 7376
36 7407
26 7408
33 7503
24 7599
37 7743
34 7760
37 7910
20 7917
28 7957
30 7984
27 8004
45 8208
39 8249
29 8289
22 8313
27 8377
19 8396
30 8468
34 8493
35 8531
21 8773
18 8866
41 9091
34 9215
51 9225
30 9243
32 9435
38 9445
18 9468
42 9563
60 9605
19 9841
29 9994
17 10033
29 10049
41 10144
31 10303
35 10416
27 10461
37 10670
18 10844
41 10875
39 10890
41 11105
61 11622
46 12038
47 12173
36 12181
43 12608
45 12775
46 12915
45 13021
49 13142
55 13206
64 13407
64 13647
66 13870
57 13989
53 14089
51 14197
36 14620
28 14816
59 14952
39 15039
73 15049
41 15179
48 15204
37 16161
72 16239
72 16427
48 16462
62 16793
51 16925
71 17027
60 17201
70 17526
59 17666
91 17692
52 17742
65 18482
77 18731
84 18835
51 19274
66 19818
53 19906
58 20065
75 20140
88 20268
83 20539
48 20639
69 20969
41 21353
73 21757
79 22811
63 23245
90 23258
92 24296
60 24351
63 24692
63 24896
75 25275
70 25405
90 25715
111 26245
103 26408
117 26691
118 28024
40 28270
83 28477
90 29254
97 29422
92 30125
104 30538
96 34109
142 35112
105 35876
145 36307
160 39023
127 40756
169 42997
104 47672
179 49126
152 53464
163 56529
167 59634
302 60161
246 62398
236 62652
250 62931
267 63476
244 66676
248 74005
360 88456

The disk file cancer contains values for breast cancer mortality from 1950 to 1960 (y) and the adult white female population in 1960 (x) for 301 counties in North Carolina, South Carolina, and Georgia.

a. Make a histogram of the population values for cancer mortality.
b. What are the population mean and total cancer mortality? What are the population variance and standard deviation?
c. Simulate the sampling distribution of the mean of a sample of 25 observations of cancer mortality.
d. Draw a simple random sample of size 25 and use it to estimate the mean and total cancer mortality.
e. Estimate the population variance and standard deviation from the sample of part (d).
f. Form 95% confidence intervals for the population mean and total from the sample of part (d). Do the intervals cover the population values?
g. Repeat parts (d) through (f) for a sample of size 100.
h. Suppose that the size of the total population of each county is known and that this information is used to improve the cancer mortality estimates by forming a ratio estimator. Do you think this will be effective? Why or why not?
i. Simulate the sampling distribution of ratio estimators of mean cancer mortality based on a simple random sample of size 25. Compare this result to that of part (c).
j. Draw a simple random sample of size 25 and estimate the population mean and total cancer mortality by calculating ratio estimates. How do these estimates compare to those formed in the usual way in part (d) from the same data?
k. Form confidence intervals about the estimates obtained in part (j).
l. Stratify the counties into four strata by population size. Randomly sample six observations from each stratum and form estimates of the population mean and total mortality.
m. Stratify the counties into four strata by population size. What are the sampling fractions for proportional allocation and optimal allocation? Compare the variances of the estimates of the population mean obtained using simple random sampling, proportional allocation, and optimal allocation.
n. How much better than those in part (m) will the estimates of the population mean be if 8, 16, 32, or 64 strata are used instead?

The data are available at www.stat.berkeley.edu/rice (http://www.cengage.com/cgi wadsworth/course products wp .pl?fid=M20b&product; isbn issn=9780534399429&tokens;)

Rice, Chapter 7, problem 65
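
The ratio estimator mentioned in parts (h) to (k) scales the sample ratio of y to x by the known population mean of x. The sketch below is only an illustration of that idea: the function name `ratio_estimate` and its arguments are hypothetical, and the commented usage assumes that Col1 is mortality (y) and Col2 is population (x), which the magnitudes of the two columns suggest but the page does not state.

```r
# Ratio estimate of the population mean and total of y under simple random
# sampling, using an auxiliary variable x whose population mean is known.
# y_s, x_s: sampled values; mu_x: known population mean of x; N: population size.
ratio_estimate <- function(y_s, x_s, mu_x, N) {
  stopifnot(length(y_s) == length(x_s), mean(x_s) != 0)
  r_hat <- mean(y_s) / mean(x_s)   # estimated ratio of y to x
  mean_hat <- r_hat * mu_x         # ratio estimate of the mean of y
  list(ratio = r_hat, mean = mean_hat, total = N * mean_hat)
}

# Hypothetical usage with the attached columns (assumed: Col1 = y, Col2 = x):
# idx <- sample(length(Col1), size = 25)
# ratio_estimate(Col1[idx], Col2[idx], mu_x = mean(Col2), N = length(Col1))
```

The same sampled rows must supply both y and x, which is why the usage draws row indices rather than sampling each column separately.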

Explanation / Answer

data1=read.csv(file.choose(),header=T)
attach(data1)

e)

samp.col2=sample(Col2,size = 25,replace = FALSE)
> samp.col2
[1] 20539 15049 3470 5246 8004 1990 3863 10461 20268 18835 4329 7304
[13] 2393 12915 16925 13870 4829 3680 3290 559 14197 47672 5108 3891
[25] 17201

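Estimates of the population variance and standard deviation. Note that Col2 holds the 1960 population counts (x); the mortality values (y) that the problem asks about are in Col1, and the same commands apply to that column: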

> var(samp.col2)
[1] 100028295
> sd(samp.col2)
[1] 10001.41
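
When the population variance is defined with divisor N and the sample is drawn without replacement, the sample variance from `var()` slightly overestimates it, and multiplying by (N - 1)/N removes the bias. A minimal sketch, assuming N = 301 counties; the helper name `pop_var_estimate` is hypothetical:

```r
# Unbiased estimate of the population variance (divisor N) from the sample
# variance s2 of a simple random sample drawn without replacement.
pop_var_estimate <- function(s2, N) {
  s2 * (N - 1) / N
}

v <- pop_var_estimate(100028295, N = 301)  # s2 taken from var(samp.col2) above
c(variance = v, sd = sqrt(v))
```

With N = 301 the factor is 300/301, so the adjustment changes the estimate by about a third of a percent.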

f)

> mean(samp.col2)
[1] 10635.52
> #95% CI for population mean
> mean(samp.col2)+c(-qt(0.025,24,lower.tail=F)*sd(samp.col2)/sqrt(25), qt(0.025,24,lower.tail=F)*sd(samp.col2)/sqrt(25))
[1] 6507.139 14763.901

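95% CI for population total (N = 301 counties times the interval for the mean) : 1958648.8 , 4443934.2

Yes, they do. Coverage is judged by comparing each interval with the population value from part (b), not with individual sample values, and an interval for a mean is expected to exclude many single observations. The mean of Col2 over the 301 counties listed above is about 11,288 and the total is about 3.40 million, and both fall inside their intervals.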
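
The interval for the total is the interval for the mean scaled by the population size N, and for sampling without replacement the standard error is usually multiplied by the finite population correction sqrt(1 - n/N). The sketch below works from the summary statistics printed above; `srs_ci` is a hypothetical helper, and the t quantile follows the answer's choice (a normal quantile is the other common option).

```r
# Confidence intervals for the population mean and total from summary
# statistics of a simple random sample of size n out of N units.
srs_ci <- function(xbar, s, n, N, level = 0.95, fpc = TRUE) {
  se <- s / sqrt(n)
  if (fpc) se <- se * sqrt(1 - n / N)   # finite population correction
  half <- qt(1 - (1 - level) / 2, df = n - 1) * se
  mean_ci <- c(lower = xbar - half, upper = xbar + half)
  list(mean = mean_ci, total = N * mean_ci)
}

srs_ci(10635.52, 10001.41, n = 25, N = 301, fpc = FALSE)  # reproduces the t interval for the mean
srs_ci(10635.52, 10001.41, n = 25, N = 301)               # same, with the correction
```

The corrected interval is always the narrower of the two, and the difference grows as n approaches N.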

g)

samp.col2.1=sample(Col2,size = 100,replace = FALSE)
> samp.col2.1
[1] 3688 7367 9435 13870 3680 16239 15179 950 7304 2099 3316 2511
[13] 1125 8866 2091 5108 53464 8249 6296 7014 62931 7288 2172 8531
[25] 2245 1438 1318 74005 6074 24296 42997 3571 1635 6978 5932 2404
[37] 3511 1114 60161 21757 20969 1933 2174 13021 1323 24896 16462 4162
[49] 3118 88456 10875 19906 3654 5743 6035 1847 2393 26408 7917 559
[61] 10303 11622 6445 10144 36307 1291 8377 11105 7115 2483 16793 7408
[73] 4149 4008 4093 63476 2782 3802 3832 1808 14197 3401 445 2476
[85] 13989 7376 5211 3290 3054 2477 24692 10461 2935 62652 18835 23245
[97] 20639 5124 35112 2962

Estimates of the population variance and standard deviation :
> var(samp.col2.1)
[1] 296489636
> sd(samp.col2.1)
[1] 17218.87
> mean(samp.col2.1)
[1] 12599.76
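> #95% CI for population mean (n = 100, so 99 degrees of freedom)
> mean(samp.col2.1)+c(-qt(0.025,99,lower.tail=F)*sd(samp.col2.1)/sqrt(100), qt(0.025,99,lower.tail=F)*sd(samp.col2.1)/sqrt(100))

This evaluates to approximately 9183.2 , 16016.4 using the mean and standard deviation printed above.

95% CI for population total (N = 301 counties times the interval for the mean) : approximately 2764100 , 4820900

Yes, they do. The mean of Col2 over the 301 counties listed above is about 11,288 and the total is about 3.40 million, and both fall inside their intervals. Individual sample values lying outside an interval for the mean say nothing about coverage.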
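
Whether a single interval covers is a yes/no comparison with the population value, but how often such intervals cover can be checked by repeated sampling. The following is a rough simulation sketch, not part of the original answer: `coverage_rate` is a hypothetical helper, and it uses a t interval with the finite population correction.

```r
# Fraction of simple random samples of size n whose confidence interval
# for the mean contains the true population mean of pop.
coverage_rate <- function(pop, n, reps = 1000, level = 0.95) {
  N <- length(pop)
  mu <- mean(pop)
  hits <- replicate(reps, {
    s <- sample(pop, size = n, replace = FALSE)
    se <- sd(s) / sqrt(n) * sqrt(1 - n / N)
    half <- qt(1 - (1 - level) / 2, df = n - 1) * se
    abs(mean(s) - mu) <= half   # TRUE when the interval covers mu
  })
  mean(hits)
}

# Hypothetical usage with the attached data:
# coverage_rate(Col2, n = 25)
# coverage_rate(Col2, n = 100)
```

For a strongly right-skewed population such as these county sizes, the observed rate at n = 25 can fall somewhat below the nominal 95%.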
