Applications of Sampling Distributions and the Central Limit Theorem
- A university professor has several hundred mice with mean weight of 30 grams and a
standard deviation of 5 grams. The professor instructs his students to go grab 25 mice
haphazardly, without any specific plan. The mean weight of the 25 mice the students
selected was 33 grams.
- Was the sample of mice a random sample?
- If the sample were random, what would you expect the mean weight of the mice in the
sample to be, on average?
- Assume that the sample was random and that the distribution of weight of all mice was
normal. What is the probability that you would get a sample mean of 33 grams?
- If the sample were random and the parent population were normally distributed, what is
the probability that you would get a mean weight of at least 33 grams?
Under these assumptions, X̄ ~ N(30, 1). Why is the sampling variance 1? It is the
population variance (25, the standard deviation squared) divided by the sample size (also
25).
So, the question is: what is P(X̄ > 33)?
P(X̄ > 33) =
P(Z > (33 - 30)/1) =
P(Z > 3) =
0.0013
So, if the sample were random this result would be very unlikely. There seems to be
evidence that without a sampling design the students tended to select the larger mice.
(Remind me to tell the story of the study that selected razorback suckers for treatment on
the Green River.)
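The tail probability above can be checked numerically. The following is a minimal Python
sketch, assuming (as the problem does) a random sample from a normal parent population.
The function names are illustrative and not part of the original notes.

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def sample_mean_upper_tail(mu, sigma, n, xbar):
    """P(sample mean > xbar) when the sample mean is N(mu, sigma^2 / n)."""
    standard_error = sigma / math.sqrt(n)  # sqrt of the sampling variance
    z = (xbar - mu) / standard_error
    return upper_tail(z)

# Mice example: mu = 30 g, sigma = 5 g, n = 25, observed sample mean 33 g.
p_mice = sample_mean_upper_tail(mu=30, sigma=5, n=25, xbar=33)
print(round(p_mice, 4))  # about 0.0013, agreeing with the table value for P(Z > 3)
```

Here the standard error is 5/sqrt(25) = 1, which is why the z-score is simply 33 - 30 = 3.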
- On 9/24 a CNN/Time poll found that 51.76% of 1,011 New Hampshire residents who have a
preference prefer Bill Bradley to Al Gore.
- What is the probability of getting 51.76% or more of a sample of 1,011 to say they
prefer Bradley if, in reality, New Hampshire residents were evenly split between Bradley
and Gore?
This sample size (n = 1,011) is large enough to apply the Central Limit Theorem.
The parent population mean, if residents were evenly split between Bradley and Gore, is
p = 0.5 and the variance is p(1-p) = 0.25. So, p̂ ~ N(0.5,
0.25/1011), with standard error sqrt(0.25/1011) = 0.0157. We are looking for P(p̂ > 0.5176).
P(p̂ > 0.5176) =
P(Z > (0.5176 - 0.5)/0.0157) =
P(Z > 1.12) =
0.1314
- What is the probability of getting the poll result if only 45% of New Hampshire
residents preferred Bradley?
In this case, p̂ ~ N(0.45,
0.45*(1-0.45)/1011), with standard error sqrt(0.45*0.55/1011) = 0.0156. We are still
looking for P(p̂ > 0.5176).
P(p̂ > 0.5176) =
P(Z > (0.5176 - 0.45)/0.0156) =
P(Z > 4.33) =
approximately 0
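Both poll calculations follow the same recipe, so they can be sketched with one helper.
This is a hedged illustration, assuming the normal approximation to the sample proportion
is adequate at n = 1,011; the function name `proportion_upper_tail` is hypothetical.

```python
import math

def proportion_upper_tail(p0, n, p_hat):
    """Return (P(sample proportion > p_hat), z) under N(p0, p0*(1 - p0)/n)."""
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    # Upper-tail probability of a standard normal at z.
    return 0.5 * math.erfc(z / math.sqrt(2)), z

# Hypothesis 1: residents evenly split (p0 = 0.50).
p_even, z_even = proportion_upper_tail(p0=0.50, n=1011, p_hat=0.5176)
# Hypothesis 2: only 45% prefer Bradley (p0 = 0.45).
p_45, z_45 = proportion_upper_tail(p0=0.45, n=1011, p_hat=0.5176)

print(f"p0 = 0.50: z = {z_even:.2f}, P = {p_even:.4f}")
print(f"p0 = 0.45: z = {z_45:.2f}, P = {p_45:.2e}")
```

The code does not round the standard error before dividing, so its z-scores can differ
from the hand calculation in the second decimal place. The conclusions are unchanged.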
- Lovastatin Study:
For each subject in the Lovastatin study the percent change in cholesterol levels since
just prior to recruitment to the study was recorded. For the treatment group the mean
change in cholesterol level was -20% after 48 weeks. The standard error was reported as 2.
- What is the probability that percent reduction in cholesterol levels would be greater
than 20% if Lovastatin actually had no effect?
The sample size is n = 61 here and we will employ the Central Limit Theorem.
However, there is a new wrinkle here. We don't actually know σ; we only have
an estimate of the standard error based on the data. For large sample sizes we often
simply assume that the estimate of the population variance is close enough and proceed as
if σ were known. So we will assume the standard error of X̄ is 2, the number reported in
the JAMA article. So, if Lovastatin had no effect, X̄ ~ N(0, 4). [When we use the
notation N(μ, σ²) we always put
the variance second. The variance of a sampling distribution is the standard
error squared, 4 in this case.] We're looking for P(X̄ < -20).
P(X̄ < -20) =
P(Z < (-20 - 0)/2) =
P(Z < -10) =
approximately 0
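When the standard error is reported directly, there is no need to divide by sqrt(n). A
minimal sketch of the treatment-group calculation, treating the reported standard error
of 2 as known (the same simplification made above); the helper name is illustrative.

```python
import math

def lower_tail_from_se(null_mean, standard_error, observed):
    """Return (P(sample mean < observed), z) given a reported standard error."""
    z = (observed - null_mean) / standard_error
    # P(Z < z) = 0.5 * erfc(-z / sqrt(2)); erfc keeps precision far out in the tail.
    return 0.5 * math.erfc(-z / math.sqrt(2)), z

# Treatment group: mean change of -20% with standard error 2, null mean 0.
p_treat, z_treat = lower_tail_from_se(null_mean=0, standard_error=2, observed=-20)
print(z_treat, p_treat)
```

The probability is not literally zero, but it is so small that a standard normal table
cannot distinguish it from zero.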
- The control group saw a 3% reduction in cholesterol levels after 48 weeks with a
standard error of 1. What is the probability that a reduction of 3% or more would be
observed if in fact the placebo had no effect?
X̄ ~ N(0, 1), since the variance is the standard error squared (1 in this case), so
P(X̄ < -3) =
P(Z < (-3 - 0)/1) =
P(Z < -3) =
0.0013
Taken at face value, this is evidence that the placebo reduces cholesterol levels, which
is certainly disconcerting. The sample size was n = 49 after 48 weeks, but 65 subjects were
initially recruited. There is the possibility of volunteer error here. Why, and how could
it explain this result?
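The control-group probability can also be reproduced with Python's standard-library
`statistics.NormalDist`. This is only a sketch under the stated null hypothesis (no
placebo effect), with the reported standard error of 1 treated as known.

```python
from statistics import NormalDist

# Sampling distribution of the control-group mean if the placebo had no effect:
# mean 0 and standard error 1. NormalDist takes the standard deviation, not the variance.
null_dist = NormalDist(mu=0, sigma=1)

# Probability of observing a mean change of -3% or lower.
p_control = null_dist.cdf(-3)
print(round(p_control, 4))  # about 0.0013
```

Note that `NormalDist` is parameterized by the standard deviation, whereas the N( , )
notation in these notes puts the variance second.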