In addition to knowing how individual data values vary about the population mean, statisticians are interested in how the means of samples of the same size, taken from the same population, vary about the population mean (i.e., how do groups of data compare to the overall data?). This leads to arguably the most important topic in all of statistics.
Before we start formally learning about sampling distributions, I’d like you to explore what happens when we take random samples from a population and calculate sample means.
Roll a die at least 20 times. Use the Excel file above to record what you get on each roll. Then use the Data Analysis “Histogram” tool to visualize the distribution of rolls with the “Bins” \(1,2,3,4,5,6\text{.}\) (Before visualizing the rolls, think about what you expect the distribution to look like.)
Now roll 2 dice and in the next sheet of the Excel file, record those two rolls and then find the average of those two rolls. Repeat this at least 20 times. Use Excel to create a histogram of the averages.
Now roll 3 dice and in the next sheet of the Excel file, record those three rolls and then find the average of those three rolls. Repeat this at least 20 times. Use Excel to create a histogram of the averages.
Next we want to do this for averages of a larger number of rolls. Doing that by hand is rather time-consuming, so we’re going to use the following link to see what happens when we roll many more dice and look at the distribution of the averages: https://math.bu.edu/people/rmagner/CLTdemo.html
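If you would like to repeat the dice experiment without rolling by hand, the following is a minimal sketch in Python that uses only the standard library. The function names `roll_averages` and `text_histogram`, the seed, and the choice of 2000 trials are my own assumptions for illustration; they do not come from the Excel file or the linked demo.

```python
import random
from collections import Counter
from statistics import mean, stdev


def roll_averages(num_dice, num_trials, rng):
    """Return `num_trials` averages, each from rolling `num_dice` fair six-sided dice."""
    return [
        sum(rng.randint(1, 6) for _ in range(num_dice)) / num_dice
        for _ in range(num_trials)
    ]


def text_histogram(values, width=40):
    """Build a rough text histogram, grouping values after rounding to one decimal."""
    counts = Counter(round(v, 1) for v in values)
    biggest = max(counts.values())
    lines = []
    for value in sorted(counts):
        bar = "#" * max(1, round(width * counts[value] / biggest))
        lines.append(f"{value:4.1f} | {bar}")
    return "\n".join(lines)


rng = random.Random(2024)  # fixed seed (arbitrary) so the run is repeatable

for num_dice in (1, 2, 3, 30):
    averages = roll_averages(num_dice, 2000, rng)
    print(f"{num_dice} dice per trial: "
          f"mean of averages = {mean(averages):.2f}, "
          f"standard deviation of averages = {stdev(averages):.2f}")

# Shape of the distribution of averages for two dice.
print(text_histogram(roll_averages(2, 2000, rng)))
```

As you change the number of dice per trial, compare both the center and the spread of the averages with what you saw in your own histograms.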
A sampling distribution of the mean is the distribution of the sample mean over numerous samples of the same size. This distribution has mean \(\mu_{\overline{x}}\) and a standard error (i.e., the standard deviation of the sample means) \(\sigma_{\overline{x}}\text{.}\)
There are three important properties that describe the distribution of sample means:
The sampling distribution of the mean of a random variable drawn from any population is approximately normal for a sufficiently large sample size. The larger the sample size, the more closely the sampling distribution resembles a normal distribution.
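To see this property with a population that is clearly not normal, here is a small Python sketch. It assumes an exponential population (a strongly right-skewed distribution that I chose only for illustration) and measures how lopsided the sample means are using skewness, which is close to 0 for a symmetric, normal-looking distribution. The helper names `sample_means` and `skewness` are hypothetical.

```python
import random
from statistics import mean, pstdev


def sample_means(sample_size, num_samples, rng):
    """Draw `num_samples` samples of `sample_size` exponential values; return their means."""
    return [
        sum(rng.expovariate(1.0) for _ in range(sample_size)) / sample_size
        for _ in range(num_samples)
    ]


def skewness(values):
    """Average cubed z-score; values near 0 indicate a roughly symmetric distribution."""
    center = mean(values)
    spread = pstdev(values)
    return sum(((v - center) / spread) ** 3 for v in values) / len(values)


rng = random.Random(7)  # fixed seed (arbitrary) so the run is repeatable

for n in (1, 5, 30, 100):
    means = sample_means(n, 5000, rng)
    print(f"n = {n:3d}: skewness of the sample means = {skewness(means):.2f}")
```

With samples of size 1 the "sample means" are just the skewed population values themselves; as the sample size grows, the skewness should shrink toward 0.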
For a population with a mean equal to 250 and a standard deviation equal to 25, calculate the standard error of the mean for the following sample sizes.
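A calculation like this can be checked with a few lines of Python. The sketch below assumes the standard formula \(\sigma_{\overline{x}} = \sigma/\sqrt{n}\) for the standard error of the mean. The sample sizes 25, 100, and 625 are hypothetical values I picked for illustration, not the ones from the exercise, and `standard_error` is just a helper name of my own.

```python
from math import sqrt


def standard_error(population_sd, sample_size):
    """Standard error of the mean: population standard deviation over sqrt(n)."""
    if sample_size < 1:
        raise ValueError("sample_size must be at least 1")
    return population_sd / sqrt(sample_size)


population_sd = 25  # standard deviation given in the exercise

# Hypothetical sample sizes, chosen only to illustrate the calculation.
for n in (25, 100, 625):
    print(f"n = {n}: standard error = {standard_error(population_sd, n):.2f}")
# n = 25: standard error = 5.00
# n = 100: standard error = 2.50
# n = 625: standard error = 1.00
```

Notice that the population mean of 250 never enters the calculation: the standard error depends only on the population standard deviation and the sample size.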