Section 8.1 Sampling Distribution of the Mean

In addition to knowing how individual data values vary about the mean for a population, statisticians are interested in knowing how the means of samples of the same size taken from the same population vary about the population mean. In other words, how do the averages of groups of data compare to the average of the overall data? This leads to arguably the most important topic in all of statistics.

Example 8.1.1.

We’re going to do an activity where we roll dice and think about the distributions of the averages of our rolls.
Use the Excel file below to record your rolls for this activity: external/sheets/DiceRollBlank.xlsx
(If you don’t have dice, you can use the tool below.)
Figure 8.1.2. (Made in GeoGebra by Duane Habecker)

(a)

Roll a die at least 20 times. Use the Excel file above to record what you get on each roll. Then use the Data Analysis “Histogram” tool to visualize the distribution of rolls with the “Bins” \(1,2,3,4,5,6\text{.}\) (Before visualizing the rolls, think about what you expect the distribution to look like.)

(b)

Now roll 2 dice and in the next sheet of the Excel file, record those two rolls and then find the average of those two rolls. Repeat this at least 20 times. Use Excel to create a histogram of the averages.

(c)

Now roll 3 dice and in the next sheet of the Excel file, record those three rolls and then find the average of those three rolls. Repeat this at least 20 times. Use Excel to create a histogram of the averages.

(d)

Next we want to do this for averages of a larger number of rolls. But this is rather time-consuming, so we’re going to use the link below to see how this would work if we rolled many more dice and looked at the distribution of the averages: https://math.bu.edu/people/rmagner/CLTdemo.html
Figure 8.1.3. 3Blue1Brown Video: “But what is the Central Limit Theorem?”
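For readers who would rather simulate the rolls than record them by hand, the activity can be sketched in Python. This is only an illustrative sketch and is not part of the Excel workflow above: the function names `simulate_dice_averages` and `text_histogram`, the seed, and the choice of 2000 trials are all assumptions made for the example.

```python
import random
import statistics
from collections import Counter


def simulate_dice_averages(num_dice, num_trials, seed=0):
    """Roll `num_dice` fair dice `num_trials` times; return the average of each trial."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    return [
        sum(rng.randint(1, 6) for _ in range(num_dice)) / num_dice
        for _ in range(num_trials)
    ]


def text_histogram(averages, width=40):
    """Return a rough text histogram with one row per distinct average."""
    counts = Counter(round(a, 2) for a in averages)
    biggest = max(counts.values())
    lines = []
    for value in sorted(counts):
        bar = "#" * max(1, round(width * counts[value] / biggest))
        lines.append(f"{value:5.2f} {bar}")
    return "\n".join(lines)


# Compare the shape of the distribution for 1, 2, and 5 dice per trial.
for num_dice in (1, 2, 5):
    averages = simulate_dice_averages(num_dice, num_trials=2000)
    print(f"{num_dice} dice: mean of averages = {statistics.fmean(averages):.3f}")
    print(text_histogram(averages))
    print()
```

With one die the bars should look roughly flat, and as the number of dice per trial grows the bars should pile up around the middle values, which is the pattern parts (a) through (d) are meant to reveal.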

Definition 8.1.4.

A sampling distribution of the mean is the distribution of the sample means from numerous samples of the same size. This distribution has mean \(\mu_{\overline{x}}\) and standard deviation \(\sigma_{\overline{x}}\text{,}\) which is called the standard error of the mean (i.e., the standard deviation of the sample means).
There are three important properties that describe the distribution of sample means:
  1. The sampling distribution of the mean of a random variable drawn from any population is approximately normal for sufficiently large sample size. The larger the sample size, the more closely the sampling distribution resembles a normal distribution.
  2. The mean of the sample means will be the same as the population mean; that is,
    \begin{equation*} \mu_{\overline{x}}=\mu\text{.} \end{equation*}
  3. The standard deviation of the sample means will be smaller than the standard deviation of the population whenever the sample size \(n\) is greater than 1; specifically,
    \begin{equation*} \sigma_{\overline{x}}=\frac{\sigma}{\sqrt{n}}\text{.} \end{equation*}
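Properties 2 and 3 can be checked exactly for the dice activity by listing every equally likely outcome of \(n\) fair dice. The sketch below is one way to do that in Python; the function name `sampling_distribution_summary` is an assumption made for this example, and the approach only works for small \(n\) because the number of outcomes is \(6^n\text{.}\)

```python
import itertools
import math
import statistics

FACES = [1, 2, 3, 4, 5, 6]  # a fair six-sided die is the "population"


def sampling_distribution_summary(n):
    """Return (mean, standard deviation) of the average of n fair dice,
    computed over all 6**n equally likely outcomes."""
    averages = [sum(rolls) / n for rolls in itertools.product(FACES, repeat=n)]
    return statistics.fmean(averages), statistics.pstdev(averages)


mu = statistics.fmean(FACES)      # population mean
sigma = statistics.pstdev(FACES)  # population standard deviation

for n in (1, 2, 3, 4):
    mean_of_means, standard_error = sampling_distribution_summary(n)
    print(
        f"n={n}: mean of sample means={mean_of_means:.4f} (mu={mu:.4f}), "
        f"standard error={standard_error:.4f} "
        f"(sigma/sqrt(n)={sigma / math.sqrt(n):.4f})"
    )
```

For each \(n\) the two means should agree and the two standard errors should agree, up to floating-point rounding.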

Exercise 8.1.5.

(Donnelly 7.7)
For a population with a mean equal to 250 and a standard deviation equal to 25, calculate the standard error of the mean for the following sample sizes.

(a)

20
Answer.
\(\sigma_{\overline{x}}=\frac{25}{\sqrt{20}}\approx 5.590\)

(b)

50
Answer.
\(\sigma_{\overline{x}}=\frac{25}{\sqrt{50}}\approx 3.536\)

(c)

80
Answer.
\(\sigma_{\overline{x}}=\frac{25}{\sqrt{80}}\approx 2.795\)
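The three calculations above can also be reproduced with a short Python sketch. The helper name `standard_error` is an assumption made for this example. Note that the population mean of 250 is not needed for the calculation.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / math.sqrt(n)


sigma = 25  # population standard deviation from the exercise

for n in (20, 50, 80):
    print(f"n = {n}: standard error = {standard_error(sigma, n):.3f}")
```

The printed values should match the answers to parts (a), (b), and (c).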

Exercise 8.1.6.

    What can you conclude about the standard error as the sample size increases?
  • As the sample size increases, standard error decreases.
  • As the sample size increases, standard error increases.
  • As the sample size increases, standard error stays the same.
  • As the sample size increases, the standard error might increase, decrease, or stay the same.