Question

Biological data analysis

All work must be done in R.

To illustrate how common sense led scientists to adopt good design for their experiments in the days before statistical methods had been developed properly and applied to the problems of design, Fisher (The Design of Experiments, Oliver and Boyd, 1960, p. 27) quoted the case of Charles Darwin who, in 1876, reported a series of experiments in which he had compared the growth of cross-fertilized plants with that of self-fertilized ones. Darwin intelligently matched pairs of plants, and grew them under exactly the same experimental conditions. However, the interpretation of the results caused him some concern. He wrote –

I long doubted whether it was worth while to give the measurements of each separate plant, but have decided to do so, in order that it may be seen that the superiority of the cross-fertilized plants over the self-fertilized ones does not commonly depend on the presence of two or three extra fine plants on the one side, or a few very poor plants on the other side … As only a moderate number of cross- and self-fertilized plants were measured, it was of great importance to me to learn how far the averages were trustworthy … I may premise that if we took by chance a dozen or score of men belonging to two nations and measured them, it would I presume be very rash to form any judgment from such small numbers on their average national heights.

Darwin thus intuitively recognized that the results must be judged by the consistency of the superiority of one group over the other, and not merely on the difference between the averages. He therefore turned for help to his cousin Galton, who was the foremost statistician of his era, but received the reply, “I doubt after making many tests, whether it is possible to derive useful conclusions from these few observations. We ought to have at least 50 plants in each case, in order to be in a position to deduce fair results.” Imagine Darwin’s disappointment – his experiments had taken him 11 years, and still he had not produced sufficient evidence!

Thanks to the completeness of Darwin’s reporting, however, we are nowadays able to give a statistical verdict on his work. In one experiment, with 15 matched pairs of maize plants, the results were as follows –

Height of cross-fertilized plant (inches) | Height of comparable self-fertilized plant (inches) | Difference (eighths of an inch)
23 4/8 | 17 3/8 | 49
12     | 20 3/8 | -67
21     | 20     | 8
22     | 20     | 16
19 1/8 | 18 3/8 | 6
21 4/8 | 18 5/8 | 23
22 1/8 | 18 5/8 | 28
20 3/8 | 15 2/8 | 41
18 2/8 | 16 4/8 | 14
21 5/8 | 18     | 29
23 2/8 | 16 2/8 | 56
21     | 18     | 24
22 1/8 | 12 6/8 | 75
23     | 15 4/8 | 60
12     | 18     | -48

Apply Wilcoxon’s Signed Rank Test to Darwin’s data. How does the answer compare with that obtained by the t test? [Hint: for the Wilcoxon test, use wilcox.test with paired=TRUE]

Explanation / Answer

Answer:

R code:

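# Heights in eighths of an inch, converted from the table (e.g. 23 4/8 inches = 188)
# x: cross-fertilized plants; y: the matched self-fertilized plants
x <- c(188, 96, 168, 176, 153, 172, 177, 163, 146, 173, 186, 168, 177, 184, 96)

y <- c(139, 163, 160, 160, 147, 149, 149, 122, 132, 144, 130, 144, 102, 124, 144)

# Wilcoxon signed rank test, then the paired t-test, on the same 15 pairs
wilcox.test(x, y, paired = TRUE)

t.test(x, y, paired = TRUE)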
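
The x and y vectors in the R code are the table heights expressed in eighths of an inch. As a rough check on that transcription, the conversion can be sketched as below. The names cross_whole, cross_eighth, self_whole, self_eighth and diff_table are made up for this illustration; the numbers are read off the table in the question.

```r
# Whole inches and extra eighths, transcribed from the table
# (vector names are hypothetical, chosen only for this sketch)
cross_whole  <- c(23, 12, 21, 22, 19, 21, 22, 20, 18, 21, 23, 21, 22, 23, 12)
cross_eighth <- c( 4,  0,  0,  0,  1,  4,  1,  3,  2,  5,  2,  0,  1,  0,  0)
self_whole   <- c(17, 20, 20, 20, 18, 18, 18, 15, 16, 18, 16, 18, 12, 15, 18)
self_eighth  <- c( 3,  3,  0,  0,  3,  5,  5,  2,  4,  0,  2,  0,  6,  4,  0)

# Convert each height to eighths of an inch
x <- cross_whole * 8 + cross_eighth
y <- self_whole  * 8 + self_eighth

# "Difference" column of the table, in eighths of an inch
diff_table <- c(49, -67, 8, 16, 6, 23, 28, 41, 14, 29, 56, 24, 75, 60, -48)

# TRUE if the converted heights reproduce the table's differences
print(all(x - y == diff_table))
```

Working in eighths keeps every value an integer. Both tests are unaffected by this change of unit, apart from the scale of the reported mean difference and confidence interval.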

R output (wilcox.test, then t.test):

Wilcoxon signed rank test

data: x and y

V = 96, p-value = 0.04126

alternative hypothesis: true location shift is not equal to 0

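The p-value, 0.04126, is below the 0.05 significance level, so we reject the null hypothesis that there is no location shift between the cross-fertilized and self-fertilized heights.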
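
To see where V = 96 comes from, the signed rank statistic can be recomputed by hand. This is only a sketch of the usual definition (rank the absolute differences, then sum the ranks of the positive ones), not a replacement for wilcox.test. The names d, r, V and V_neg are chosen for this illustration.

```r
# Heights in eighths of an inch, as in the answer's R code
x <- c(188, 96, 168, 176, 153, 172, 177, 163, 146, 173, 186, 168, 177, 184, 96)
y <- c(139, 163, 160, 160, 147, 149, 149, 122, 132, 144, 130, 144, 102, 124, 144)

d <- x - y              # paired differences (cross minus self)
r <- rank(abs(d))       # ranks of the absolute differences (no ties in these data)

V     <- sum(r[d > 0])  # sum of ranks for pairs where the cross-fertilized plant is taller
V_neg <- sum(r[d < 0])  # sum of ranks for the two pairs where it is shorter

n <- length(d)
print(c(V = V, V_neg = V_neg, total = n * (n + 1) / 2))
```

Only two of the 15 differences are negative (-67 and -48), with ranks 14 and 10, so the negative rank sum is 24 and V = 120 - 24 = 96. The two negative differences are large in magnitude, which is why the evidence is weaker than a simple count of 13 positive pairs out of 15 would suggest.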

> t.test(x,y, paired=TRUE)

        Paired t-test

data: x and y

t = 2.148, df = 14, p-value = 0.0497

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

0.03119332 41.83547335

sample estimates:

mean of the differences

               20.93333

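The paired t-test gives p = 0.0497, which is also below the 0.05 level, though only just: the 95 percent confidence interval for the mean difference, about 0.03 to 41.84 eighths of an inch, barely excludes zero. Both tests therefore lead to the same conclusion. We reject the null hypothesis of no difference, with the cross-fertilized plants taller by about 20.9 eighths of an inch (roughly 2.6 inches) on average.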
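
The paired t-test is a one-sample t-test on the differences, so its output can be reproduced from the mean and standard deviation of x - y. The following is a minimal sketch under that standard definition; the names d, n, se, t_stat, p_val and ci are chosen for this illustration.

```r
# Heights in eighths of an inch, as in the answer's R code
x <- c(188, 96, 168, 176, 153, 172, 177, 163, 146, 173, 186, 168, 177, 184, 96)
y <- c(139, 163, 160, 160, 147, 149, 149, 122, 132, 144, 130, 144, 102, 124, 144)

d  <- x - y               # paired differences
n  <- length(d)           # number of pairs
se <- sd(d) / sqrt(n)     # standard error of the mean difference

t_stat <- mean(d) / se                                  # t statistic on n - 1 degrees of freedom
p_val  <- 2 * pt(-abs(t_stat), df = n - 1)              # two-sided p-value
ci     <- mean(d) + c(-1, 1) * qt(0.975, df = n - 1) * se  # 95 percent confidence interval

print(c(t = t_stat, df = n - 1, p = p_val))
print(ci)
print(mean(d) / 8)        # mean difference converted back to inches
```

These values should agree with the t.test output above. Because the t-test uses the sizes of the differences rather than their ranks, the two large negative differences inflate the standard deviation, which is one reason its p-value is slightly larger than the Wilcoxon one.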