
 

Applications of Sampling Distributions and the Central Limit Theorem

 

  1. A university professor has several hundred mice with mean weight of 30 grams and a standard deviation of 5 grams. The professor instructs his students to go grab 25 mice haphazardly, without any specific plan. The mean weight of the 25 mice the students selected was 33 grams.
     
    1. Was the sample of mice a random sample?
       
    2. If the sample were random, what would you expect the mean weight of the mice in the sample to be, on average?
       
    3. Assume that the sample was random and that the distribution of weight of all mice was normal. What is the probability that you would get a sample mean of 33 grams?
       
    4. If the sample were random and the parent population were normally distributed, what is the probability that you would get a mean weight of at least 35 grams?
       
      Under these assumptions, x̄ ~ N(30, 1). Why is the sampling variance 1? It is the population variance (25, the standard deviation squared) divided by the sample size (also 25).
       
      So the question is: what is P(x̄ > 33)?
       
      P(x̄ > 33) =
      P(Z > (33 - 30)/1) =
      P(Z > 3) =
      0.0013
       
      So, if the sample were random this result would be very unlikely. There seems to be evidence that without a sampling design the students tended to select the larger mice.
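
The tail probability above can be checked numerically. What follows is a minimal Python sketch, not part of the original course materials. The function name `upper_tail_sample_mean` is made up for illustration, and the code assumes, as the solution does, a random sample from a normally distributed population.

```python
import math

def upper_tail_sample_mean(xbar, mu, sigma, n):
    """Return P(sample mean > xbar) when the sample mean ~ N(mu, sigma^2 / n)."""
    standard_error = sigma / math.sqrt(n)      # 5 / sqrt(25) = 1 for the mice
    z = (xbar - mu) / standard_error
    # For a standard normal Z, P(Z > z) = 0.5 * erfc(z / sqrt(2)).
    return 0.5 * math.erfc(z / math.sqrt(2))

p_33 = upper_tail_sample_mean(33, mu=30, sigma=5, n=25)
print(f"P(xbar > 33) = {p_33:.4f}")   # z = 3; the table value is 0.0013

# Part 4 asks about 35 grams; the same function applies with z = 5.
p_35 = upper_tail_sample_mean(35, mu=30, sigma=5, n=25)
print(f"P(xbar > 35) = {p_35:.2e}")
```

Using `math.erfc` rather than a table lookup keeps the sketch dependent only on the standard library and stays accurate for the very small probability in the 35-gram case.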
       
      (Remind me to tell you about the study of selecting razorback suckers for treatment on the Green River.)
       
  2. On 9/24 a CNN/Time poll found that 51.76% of 1,011 New Hampshire residents who have a preference prefer Bill Bradley to Al Gore.
     
    1. What is the probability of getting 51.76% or more of a sample of 1,011 to say they prefer Bradley if, in reality, New Hampshire residents were evenly split between Bradley and Gore?
       
      This sample size (n = 1,011) is large enough to apply the Central Limit Theorem. The parent population mean, if residents were evenly split between Bradley and Gore, is p = 0.5 and the variance is p(1-p) = 0.25. So, p̂ ~ N(0.5, 0.25/1011), which gives a standard error of sqrt(0.25/1011) ≈ 0.0157. We are looking for P(p̂ > 0.5176).
       
      P(p̂ > 0.5176) =
      P(Z > (0.5176 - 0.5)/0.0157) =
      P(Z > 1.12) =
      0.1314
       
    2. What is the probability of getting the poll result if only 45% of New Hampshire residents preferred Bradley?
       
      In this case, p̂ ~ N(0.45, 0.45*(1-0.45)/1011), which gives a standard error of sqrt(0.45*0.55/1011) ≈ 0.0156. We are still looking for P(p̂ > 0.5176).
       
      P(p̂ > 0.5176) =
      P(Z > (0.5176 - 0.45)/0.0156) =
      P(Z > 4.33) ≈
      0
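
Both poll calculations follow the same recipe, so they can be checked together. The sketch below is an illustration only: the helper name `upper_tail_proportion` is an assumption of this example, and it relies on the same Central Limit Theorem approximation used in the solutions. It uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later).

```python
import math
from statistics import NormalDist

def upper_tail_proportion(p_hat, p0, n):
    """Return (standard error, P(sample proportion > p_hat)) under the
    CLT approximation: sample proportion ~ N(p0, p0 * (1 - p0) / n)."""
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    sampling_dist = NormalDist(mu=p0, sigma=standard_error)
    return standard_error, 1 - sampling_dist.cdf(p_hat)

# Observed poll result: 51.76% of 1,011 respondents prefer Bradley.
se_even, p_even = upper_tail_proportion(0.5176, p0=0.50, n=1011)
se_45, p_45 = upper_tail_proportion(0.5176, p0=0.45, n=1011)

print(f"p0 = 0.50: SE = {se_even:.4f}, tail = {p_even:.4f}")
print(f"p0 = 0.45: SE = {se_45:.4f}, tail = {p_45:.1e}")
```

Because the hand calculations round the standard error and the z-value before using the table, the computed probabilities may differ from the table answers in the last digit.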
       
  3. Lovastatin Study: For each subject in the Lovastatin study the percent change in cholesterol levels since just prior to recruitment to the study was recorded. For the treatment group the mean change in cholesterol level was -20% after 48 weeks. The standard error was reported as 2.
     
    1. What is the probability that the percent reduction in cholesterol levels would be greater than 20% if Lovastatin actually had no effect?
       
      The sample size is n = 61 here and we will employ the Central Limit Theorem. However, there is a new wrinkle. We don't actually know σ²; we only have an estimate of the standard error based on the data. For large sample sizes we often simply assume that the estimate of the population variance is close enough and proceed as if σ² were known. So we will assume σ_x̄ = 2, the number reported in the JAMA article. So, if Lovastatin had no effect, x̄ ~ N(0, 4). [When we use the notation N(μ, σ²) we always put the variance second. The variance of a sampling distribution is the standard error squared, 4 in this case.] We're looking for P(x̄ < -20).
       
      P(x̄ < -20) =
      P(Z < (-20 - 0)/2) =
      P(Z < -10) ≈
      0
       
    2. The control group saw a 3% reduction in cholesterol levels after 48 weeks with a standard error of 1. What is the probability that a reduction of 3% or more would be observed if in fact the placebo had no effect?
       
      x̄ ~ N(0, 1), since the variance is the standard error squared, so
       
      P(x̄ < -3) =
      P(Z < (-3 - 0)/1) =
      P(Z < -3) =
      0.0013
       
      Does this provide evidence that the placebo reduces cholesterol levels? That would certainly be disconcerting. Only n = 49 subjects remained after 48 weeks, although 65 were initially recruited. There is the possibility of volunteer error here. Why, and how could it explain this result?
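
The two Lovastatin tail probabilities can also be checked numerically. This is a minimal sketch with a made-up helper name, `lower_tail_given_se`. Like the solutions, it treats the reported standard error as if it were known exactly, which is itself an approximation.

```python
import math

def lower_tail_given_se(xbar, mu0, standard_error):
    """Return P(sample mean < xbar) when the sample mean ~ N(mu0, standard_error^2)."""
    z = (xbar - mu0) / standard_error
    # P(Z < z) = 0.5 * erfc(-z / sqrt(2)); erfc stays accurate far out in the tail.
    return 0.5 * math.erfc(-z / math.sqrt(2))

p_treatment = lower_tail_given_se(-20, mu0=0, standard_error=2)   # z = -10
p_placebo = lower_tail_given_se(-3, mu0=0, standard_error=1)      # z = -3

print(f"treatment group: P(xbar < -20) = {p_treatment:.2e}")
print(f"placebo group:   P(xbar < -3)  = {p_placebo:.4f}")
```

The treatment-group probability is not literally zero, only far too small to appear in a standard normal table, which is why the hand calculation reports it as 0.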

 

E-mail Mr. Callahan at stat110@edcallahan.com with questions or comments about this web site or about the class itself.

This page was last modified on October 22, 1999.