Section Confidence Intervals and Hypothesis Testing Worksheet

Our goal is to use a sample to understand what is happening in our population. We are going to focus on estimating means and proportions for our population using a sample.

Subsection Confidence Intervals for the Mean

Cases that we need to consider separately:
  • Population standard deviation (\(\sigma\)) is known:
    \begin{equation*} \bar{x}\pm z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) \end{equation*}
  • Population standard deviation (\(\sigma\)) is not known, and we only know the sample standard deviation (\(s\)); here \(t_{\alpha/2}\) comes from the \(t\) distribution with \(n-1\) degrees of freedom:
    \begin{equation*} \bar{x}\pm t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right) \end{equation*}
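The two formulas above differ only in the critical value and in which standard deviation is used, so one small function can cover both cases. The following is a minimal Python sketch, not part of the worksheet's Excel workflow. The function name `mean_confidence_interval` and the sample numbers (\(\bar{x}=50\), standard deviation \(10\), \(n=25\)) are made up for illustration, and the \(t\) critical value is read from a standard \(t\) table because Python's standard library has no \(t\) distribution.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(xbar, sd, n, critical_value):
    """Return (lower, upper) for xbar +/- critical_value * sd / sqrt(n)."""
    margin = critical_value * sd / sqrt(n)
    return xbar - margin, xbar + margin


alpha = 0.05  # 95% confidence level

# Case 1: sigma is known, so the critical value is z_{alpha/2}.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # roughly 1.96
z_interval = mean_confidence_interval(xbar=50.0, sd=10.0, n=25, critical_value=z_crit)

# Case 2: sigma is not known, so sd is the sample standard deviation s and
# the critical value is t_{alpha/2} with n - 1 = 24 degrees of freedom.
# The value 2.064 is taken from a standard t table (illustrative input).
t_crit = 2.064
t_interval = mean_confidence_interval(xbar=50.0, sd=10.0, n=25, critical_value=t_crit)

print("z interval:", z_interval)
print("t interval:", t_interval)
```

With the same inputs, the \(t\) interval is slightly wider than the \(z\) interval, which reflects the extra uncertainty from estimating \(\sigma\) with \(s\).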

Subsection Confidence Interval for the Proportion

\begin{equation*} \overline{p}\pm (z_{\alpha/2})\cdot\left(\sqrt{\frac{\overline{p}(1-\overline{p})}{n}}\right) \end{equation*}
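As a rough illustration of the proportion formula, here is a short Python sketch. The function name `proportion_confidence_interval` and the counts (120 successes out of 400 observations) are hypothetical and are not taken from any of the worksheet's data files.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Return (lower, upper) for p_bar +/- z_{alpha/2} * sqrt(p_bar * (1 - p_bar) / n)."""
    p_bar = successes / n                       # sample proportion
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z_crit * sqrt(p_bar * (1 - p_bar) / n)
    return p_bar - margin, p_bar + margin


# Hypothetical sample: 120 "yes" responses out of 400.
lower, upper = proportion_confidence_interval(successes=120, n=400)
print(f"95% CI for p: ({lower:.4f}, {upper:.4f})")
```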

Example 7.1. Credit Balance Confidence Interval.

Open the file “NewBalance” (external/sheets/NewBalance.xlsx).
This file shows credit balances for a sample of customers. Construct a \(95\%\) confidence interval in Excel for the average credit balance.

Subsection Confidence Intervals in Tableau

Example 7.2.

Tableau can also construct confidence intervals! Let’s construct the confidence interval from Example 7.1 in Tableau.

Subsection Real-World Examples of Finding Confidence Intervals in Excel

Example 7.3. US Census Data Confidence Intervals.

Let’s look at US Census data from October 2023: https://bit.ly/3SskrqA
(a)
Use the sample data to estimate the average number of hours worked per week in the US among people who work a positive number of hours each week. (Look at “pehract1”.)
(b)
Use the sample data to estimate the proportion of households in the US in which someone owns a business or a farm. (Look at “HUBUS”.)

Subsection Hypothesis Testing

We have shown how a sample can be used to develop point and interval estimates of population parameters such as the mean (\(\mu\)) and the proportion (\(p\)). Now we continue the discussion of statistical inference by showing how hypothesis testing can be used to determine whether a statement about the value of a population parameter should or should not be rejected.

Definition 7.4.

  • A hypothesis is an assumption about a population parameter.
  • The null hypothesis, denoted \(H_0\text{,}\) represents the status quo and states that the population parameter is equal to (\(=\)), at most (\(\leq\)), or at least (\(\geq\)) a specific value.
  • The alternative hypothesis, denoted \(H_1\text{,}\) represents the opposite of \(H_0\text{,}\) and is believed to be true if the null hypothesis is found to be false.

Example 7.5.

  • \(H_0\text{:}\) No more than \(30\%\) of the registered voters in Santa Clara County voted in the primary election.
  • \(H_1\text{:}\) More than \(30\%\) of the registered voters in Santa Clara County voted in the primary election.
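In symbols, writing \(p\) for the proportion of registered voters in Santa Clara County who voted in the primary election, these two hypotheses can be stated as:

\begin{equation*} H_0\text{: } p \leq 0.30 \qquad H_1\text{: } p > 0.30 \end{equation*}

Because \(H_1\) uses \(>\text{,}\) only sample proportions well above \(0.30\) count as evidence against \(H_0\text{.}\)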

Subsection Test Statistics for Hypothesis Tests

  • Test statistic for hypothesis tests about a population mean if \(\sigma\) is known:
    \begin{equation*} z = \frac{\overline{x}-(\mu)_{H_0}}{\sigma/\sqrt{n}} \end{equation*}
  • Test statistic for hypothesis tests about a population mean if \(\sigma\) is NOT known:
    \begin{equation*} t = \frac{\overline{x}-(\mu)_{H_0}}{s/\sqrt{n}} \end{equation*}
  • Test statistic for hypothesis tests about a population proportion:
    \begin{equation*} z = \frac{\overline{p}-(p)_{H_0}}{\sqrt{\frac{(p)_{H_0}(1-(p)_{H_0})}{n}}} \end{equation*}
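The test statistics above can be sketched in Python as follows. The function names and the input numbers are hypothetical, chosen only to show the arithmetic. The same function serves the \(z\) and \(t\) statistics for a mean, since they differ only in whether \(\sigma\) or \(s\) is passed in.

```python
from math import sqrt


def mean_test_statistic(xbar, mu_0, sd, n):
    """(xbar - mu_0) / (sd / sqrt(n)).

    Pass sigma as sd to get the z statistic, or the sample
    standard deviation s to get the t statistic.
    """
    return (xbar - mu_0) / (sd / sqrt(n))


def proportion_test_statistic(p_bar, p_0, n):
    """z statistic for a proportion; the standard error uses the H_0 value p_0."""
    return (p_bar - p_0) / sqrt(p_0 * (1 - p_0) / n)


# Hypothetical inputs for illustration only.
z_mean = mean_test_statistic(xbar=52.0, mu_0=50.0, sd=10.0, n=25)
z_prop = proportion_test_statistic(p_bar=0.35, p_0=0.30, n=400)

print("mean test statistic:", z_mean)
print("proportion test statistic:", round(z_prop, 4))
```

Note that the proportion statistic uses the hypothesized value \((p)_{H_0}\) in the denominator, not the sample proportion \(\overline{p}\) that appears in the confidence interval formula.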

Subsection p-value

Definition 7.6.

A p-value is the probability, assuming that \(H_0\) is true, of obtaining a random sample of size \(n\) that results in a test statistic at least as extreme as the one observed in the current sample.
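For a \(z\) test statistic, “at least as extreme” translates into a tail area under the standard normal curve. Which tail depends on the direction of \(H_1\text{.}\) The following is a minimal Python sketch under that assumption. The function name `z_p_value` and the `tail` labels are illustrative choices, and a \(t\) statistic would need the \(t\) distribution instead, which Python's standard library does not provide.

```python
from statistics import NormalDist


def z_p_value(z, tail="two-sided"):
    """p-value for a z test statistic under the standard normal distribution.

    tail="upper"     for H_1 of the form "parameter > value"
    tail="lower"     for H_1 of the form "parameter < value"
    tail="two-sided" for H_1 of the form "parameter != value"
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "upper":
        return 1 - std_normal.cdf(z)
    if tail == "lower":
        return std_normal.cdf(z)
    if tail == "two-sided":
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'upper', 'lower', or 'two-sided'")


# Illustrative test statistic, not computed from any worksheet data.
p_two_sided = z_p_value(1.96)
p_upper = z_p_value(1.96, tail="upper")
print(round(p_two_sided, 4), round(p_upper, 4))
```

A small p-value (commonly compared against a chosen significance level \(\alpha\)) indicates that the observed sample would be unusual if \(H_0\) were true.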

Example 7.7.

We want to test the hypothesis that the proportion of households in the US in which someone owns a business or a farm is \(10.4\%\) using the US Census sample data from October 2023: https://bit.ly/3SskrqA
(Look at “HUBUS” like we did in Example 7.3.)
Determine the null and alternative hypotheses and find the test statistic.
Then find the p-value and decide what you can conclude based on this sample.