Question
Activity 2: A fishing lake has only two kinds of fish: bass and catfish. Management wants to know what percent are bass and what percent are catfish.
1. How do two colors simulate this situation? What is the population? (Your instructor will supply the objects that represent the population of fish.)
2. Without looking, draw a sample of 10 fish, record the colors and replace the fish. Is your sample a simple random sample? Explain.
3. Repeat several times. Does each sample have the same number of fish of each type?
4. Based on the evidence gathered, make a prediction about the percent of the fish that are bass and the percent that are catfish.
I do not need questions 1 through 4 answered, I only need "a" and "b" below answered.
I typed out activity 2 so that it would make sense as to what question "a" is talking about.
a) Describe a simulation of estimating the fish population in Activity 2 by using random numbers for the drawing of samples of size 10. Assume that 25% of the fish are bass and the remainder are catfish.
b) Use your simulation to get 5 samples. How do the percents in the samples compare with each other and also with those in the population?
Explanation / Answer
A random number table is designed to supply uniformly distributed digits, so this use is straightforward. I will describe the procedure because it is a useful algorithm for any statistician to know: random permutations are very important (for resampling and bootstrapping, for instance) and, because they may be generated a huge number of times, generating them efficiently can be essential.
Usually, one designates a starting entry in the table and a rule for selecting sequences of digits. For instance, for sampling from 100 fish, first number them from 0 through 99 in any order you like. Before looking at the table, you might stipulate that you will start in row 3, column 1 and pick every other digit, grouping the digits in pairs. This determines the sequence
89, 10, 58, 44, ...
Via our numbering system we will interpret this sequence as elements of the population.
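The grouping rule and the numbering system can be sketched in Python. This is a minimal illustration rather than part of the original procedure: the digit string is hypothetical (it was chosen to reproduce the pairs 89, 10, 58, 44 and is not copied from a real table), and labeling fish 0 through 24 as bass is an assumption taken from the 25% figure in part (a).

```python
def group_digits(digits, size=2):
    """Split a string of random digits into consecutive groups, each read as an integer."""
    usable = len(digits) - len(digits) % size  # ignore an incomplete trailing group
    return [int(digits[i:i + size]) for i in range(0, usable, size)]


def fish_type(fish_id, n_bass=25):
    """Assumed labeling: fish numbered 0..n_bass-1 are bass, all others are catfish."""
    return "bass" if fish_id < n_bass else "catfish"


# Hypothetical digits already selected from a table (not a real table excerpt).
selected_digits = "89105844"
ids = group_digits(selected_digits)
sample = [fish_type(i) for i in ids]
print(ids)     # [89, 10, 58, 44]
print(sample)  # ['catfish', 'bass', 'catfish', 'catfish']
```

Because the activity replaces each fish after drawing it, a sample of size 10 would simply use the first ten pairs, repeats included, and the sample percent of bass is the count of bass divided by 10.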
When N is not a power of 10 you can proceed in several ways. Perhaps the most efficient is to partition the random digits into longish groups, interpret each group as a value in the interval [0,1) by placing an implicit decimal point in front of it, multiply that value by N, and round down.
For example, with N=123 items numbered from 0 through 122 and again starting in row 3 column 1, this time grouping the digits in sixes, we would produce
floor(123 * .859414) = 105, floor(123 * .075682) = 9, floor(123 * .414020) = 50, floor(123 * .156114) = 19, ...
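The scaling step can be checked with a short Python sketch. The function name `scale_group` is hypothetical; the digit groups are the ones used in the example above. Integer arithmetic is used here as a design choice so that the rounding down is exact.

```python
def scale_group(group, n):
    """Map a string of digits, read as a decimal fraction in [0, 1), to an index in 0..n-1."""
    # Integer arithmetic gives floor(n * 0.d1d2...dk) without floating-point rounding.
    return n * int(group) // 10 ** len(group)


groups = ["859414", "075682", "414020", "156114"]
print([scale_group(g, 123) for g in groups])  # [105, 9, 50, 19]
```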
This can be performed in the field without any tools at all.
Sampling without replacement
The most straightforward way to accomplish this is simply to remove duplicates as they are encountered in the sequence of indexes generated above. Two tricks are frequently employed when doing this manually (in the field) or by computer with very large populations.
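As a rough sketch of duplicate removal, the following Python function keeps the first occurrence of each index and skips repeats until the sample is full. The function name and the index stream are hypothetical, chosen only for illustration.

```python
def sample_without_replacement(indexes, size):
    """Take the first `size` distinct values from a stream of random indexes."""
    seen = set()
    sample = []
    for idx in indexes:
        if idx in seen:
            continue  # duplicate: this unit was already drawn, so skip it
        seen.add(idx)
        sample.append(idx)
        if len(sample) == size:
            break
    return sample


# Hypothetical index stream with repeats (illustrative values only).
stream = [89, 10, 89, 58, 10, 44, 7]
print(sample_without_replacement(stream, 4))  # [89, 10, 58, 44]
```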
The first trick applies when sampling more than half the population: instead of drawing the sample directly, draw (without replacement) the elements that will be left out, then keep the rest as the sample.
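A minimal Python sketch of this exclusion trick follows. The function name and the drawn positions are hypothetical. Note that the kept elements come out in their original order, so this yields a random subset rather than a random ordering.

```python
def sample_by_exclusion(population, excluded_indexes):
    """Keep every element whose position was not drawn for exclusion."""
    excluded = set(excluded_indexes)
    return [item for pos, item in enumerate(population) if pos not in excluded]


population = list("abcdefghij")
# To sample 8 of 10, draw only the 2 positions to leave out (hypothetical draws).
print(sample_by_exclusion(population, [8, 0]))  # ['b', 'c', 'd', 'e', 'f', 'g', 'h', 'j']
```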
The second trick is an elegant algorithm that avoids searching for duplicates. It generates a random k-permutation (an ordered subset of the population of specified size k); the ordering is useful in its own right. We begin by writing the identifiers of the population in any sequence we like. As an example, if we were to sample from a population of N=10 individuals named {a,b,c,d,e,f,g,h,i,j}, we would begin with, say, this array, preceded by the elements put into the sample so far (none of them):
(), a b c d e f g h i j
Generate a sequence of random numbers as before. For this example let's use the previous sequence .859414, .075682, .414020, ... .
Because the array currently has 10 elements in it, we use the first random number from the table to compute the random index floor(10 * 0.859414) = 8. Interpret this as an index into the array, remembering that array indexes start at 0, so index 8 designates i. Swap the first element of the array with the element at this index, keep the element now in the first position for your sample, and then drop the first element from the array altogether, leaving this:
(i), b c d e f g h a j
Repeat, bearing in mind the array now has only N-1 = 9 elements. So we use the second random value from the table to compute floor(9 * 0.075682) = 0. That happens to designate the first element of the remainder of the array, b, which is swapped with itself and also put into the growing sample:
(i, b), c d e f g h a j
Another repetition produces the index floor(8 * .414020) = 3, identifying the value f in the array, which is swapped with the initial c:
(i, b, f), d e c g h a j
We can see that the only actual changes made to the array are the swaps: the random permutation, here (i, b, f), automatically appears in the first k entries of the array after k swaps are performed in this manner. In this fashion we can generate a random k-permutation by means of just k (uniformly distributed) random numbers and k swaps: a highly efficient procedure. Moreover, in cases where the sample size is not fixed in advance, this procedure can be continued with the remaining non-sampled elements.
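The swap procedure can be sketched in Python as an in-place partial shuffle. This is one possible rendering under stated assumptions, not a definitive implementation: the function name is hypothetical, and instead of dropping the first element after each swap it keeps the sample in the first k positions of the same array, which is equivalent. The uniform values are the ones from the worked example.

```python
import math


def random_k_permutation(population, uniforms):
    """Partial shuffle: one uniform value in [0, 1) and one swap per sampled element."""
    arr = list(population)  # work on a copy
    n = len(arr)
    for k, u in enumerate(uniforms):
        # Index into the not-yet-sampled part arr[k:], which has n - k elements.
        j = k + math.floor((n - k) * u)
        arr[k], arr[j] = arr[j], arr[k]
    size = len(uniforms)
    return arr[:size], arr[size:]


sample, rest = random_k_permutation("abcdefghij", [0.859414, 0.075682, 0.414020])
print(sample)  # ['i', 'b', 'f']
print(rest)    # ['d', 'e', 'c', 'g', 'h', 'a', 'j']
```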
To prove that all k-permutations are equally likely to occur in this algorithm, note that every element of the population has an equal chance of being chosen first. The same argument then applies to the N-1 elements that remain, so the proof is finished by induction (because this is a recursive algorithm).
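The claim can also be checked exhaustively for a small case. The Python sketch below is not a proof and its function names are hypothetical. It runs the swap procedure over every possible sequence of index choices for N=4 and k=2, and confirms that each of the 12 possible 2-permutations arises from exactly one sequence, so equally likely index choices give equally likely permutations.

```python
from collections import Counter
from itertools import product


def k_permutation_from_indexes(population, indexes):
    """Apply the swap procedure with explicitly chosen indexes instead of random ones."""
    arr = list(population)
    for k, offset in enumerate(indexes):
        j = k + offset  # offset ranges over 0 .. len(arr) - k - 1
        arr[k], arr[j] = arr[j], arr[k]
    return tuple(arr[:len(indexes)])


def count_outcomes(population, size):
    """Count how often each k-permutation arises over all equally likely index choices."""
    n = len(population)
    choices = [range(n - k) for k in range(size)]
    return Counter(k_permutation_from_indexes(population, idx) for idx in product(*choices))


counts = count_outcomes("abcd", 2)
print(len(counts), set(counts.values()))  # 12 {1}
```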