Section Confidence Intervals and Hypothesis Testing Worksheet
Our goal is to use a sample to understand what is happening in our population. We are going to focus on estimating means and proportions for our population using a sample.
Subsection Confidence Intervals for the Mean
Cases that we need to consider separately:
Population standard deviation (\(\sigma\)) is known:
\begin{equation*}
\bar{x}\pm z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)
\end{equation*}
Population standard deviation (\(\sigma\)) is not known, so we use the sample standard deviation \(s\) and a critical value from the \(t\) distribution with \(n-1\) degrees of freedom:
\begin{equation*}
\bar{x}\pm t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)
\end{equation*}
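The two cases above can be sketched in Python. This is a minimal illustration, not the worksheet's Excel method: the function names and all numeric inputs are hypothetical, and the \(t\) critical value is read from a standard \(t\) table because the Python standard library has no \(t\) distribution.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided standard normal critical value z_{alpha/2}."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


def mean_interval(xbar, sd, n, critical):
    """Return (lower, upper) for xbar +/- critical * sd / sqrt(n)."""
    margin = critical * sd / sqrt(n)
    return xbar - margin, xbar + margin


# Case 1: sigma known (hypothetical values: xbar = 50, sigma = 10, n = 100).
z_low, z_high = mean_interval(50, 10, 100, z_critical(0.95))

# Case 2: sigma unknown (hypothetical values: xbar = 50, s = 10, n = 25).
# 2.064 is the tabled t value for alpha/2 = 0.025 with n - 1 = 24 degrees of freedom.
t_low, t_high = mean_interval(50, 10, 25, 2.064)

print(f"z interval: ({z_low:.2f}, {z_high:.2f})")
print(f"t interval: ({t_low:.2f}, {t_high:.2f})")
```

With the same sample mean and spread, the \(t\) interval is wider here both because the sample is smaller and because the \(t\) critical value exceeds the \(z\) critical value.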
Subsection Confidence Interval for the Proportion
\begin{equation*}
\overline{p}\pm (z_{\alpha/2})\cdot\left(\sqrt{\frac{\overline{p}(1-\overline{p})}{n}}\right)
\end{equation*}
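As a rough sketch of the proportion formula above, the following Python computes the interval from a count of "successes" and the sample size. The function name and the counts (240 out of 600) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Return (lower, upper) for p_bar +/- z_{alpha/2} * sqrt(p_bar(1 - p_bar)/n)."""
    p_bar = successes / n  # sample proportion
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_bar * (1 - p_bar) / n)
    return p_bar - margin, p_bar + margin


# Hypothetical sample: 240 of 600 respondents have the characteristic.
low, high = proportion_interval(240, 600)
print(f"95% interval for p: ({low:.4f}, {high:.4f})")
```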
Example 7.1. Credit Balance Confidence Interval.
This file shows credit balances for a sample of customers. Construct a \(95\%\) confidence interval in Excel for the average credit balance.
Subsection Confidence Intervals in Tableau
Example 7.2.
Tableau can also construct confidence intervals! Let’s construct the confidence interval from Example 7.1 in Tableau.
Subsection Real-World Examples of Finding Confidence Intervals in Excel
Example 7.3. US Census Data Confidence Intervals.
(a)
Use the sample data to estimate the average number of hours worked per week in the US among people who work a positive number of hours each week. (Look at “pehract1”.)
(b)
Use the sample data to estimate the proportion of households in the US in which someone owns a business or a farm. (Look at “HUBUS”.)
Subsection Hypothesis Testing
We have shown how a sample can be used to develop point and interval estimates of population parameters such as the mean (\(\mu\)) and the proportion (\(p\)). Now we continue the discussion of statistical inference by showing how hypothesis testing can be used to decide whether a statement about the value of a population parameter should be rejected.
Definition 7.4.
A hypothesis is an assumption about a population parameter.
The null hypothesis, denoted \(H_0\text{,}\) represents the status quo and states the belief that the population parameter is equal to (\(=\)), at most (\(\leq\)), or at least (\(\geq\)) a specific value.
The alternative hypothesis, denoted \(H_1\text{,}\) represents the opposite of \(H_0\text{,}\) and is believed to be true if the null hypothesis is found to be false.
Example 7.5.
\(H_0\text{:}\) No more than \(30\%\) of the registered voters in Santa Clara County voted in the primary election.
\(H_1\text{:}\) More than \(30\%\) of the registered voters in Santa Clara County voted in the primary election.
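Written in symbols, with \(p\) denoting the proportion of registered voters in Santa Clara County who voted in the primary election, this pair of hypotheses reads:
\begin{equation*}
H_0\text{:}\ p \leq 0.30 \qquad H_1\text{:}\ p > 0.30
\end{equation*}
Because \(H_1\) uses \(>\text{,}\) only large values of the sample proportion count as evidence against \(H_0\text{,}\) so this is a one-tailed (right-tailed) test.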
Subsection Test Statistics for Hypothesis Tests
Test statistic for hypothesis tests about a population mean if \(\sigma\) is known:
\begin{equation*}
z = \frac{\overline{x}-(\mu)_{H_0}}{\sigma/\sqrt{n}}
\end{equation*}
Test statistic for hypothesis tests about a population mean if \(\sigma\) is NOT known:
\begin{equation*}
t = \frac{\overline{x}-(\mu)_{H_0}}{s/\sqrt{n}}
\end{equation*}
Test statistic for hypothesis tests about a population proportion:
\begin{equation*}
z = \frac{\overline{p}-(p)_{H_0}}{\sqrt{\frac{(p)_{H_0}(1-(p)_{H_0})}{n}}}
\end{equation*}
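The three test statistics above can be sketched in Python as follows. The function names and sample values are hypothetical. The mean formula is the same for \(z\) and \(t\): only the standard deviation that is passed in (\(\sigma\) or \(s\)) and the reference distribution differ.

```python
from math import sqrt


def mean_test_statistic(xbar, mu0, sd, n):
    """(xbar - mu0) / (sd / sqrt(n)); pass sigma for z, or s for t."""
    return (xbar - mu0) / (sd / sqrt(n))


def proportion_test_statistic(p_bar, p0, n):
    """z statistic for a proportion; the standard error uses the H0 value p0."""
    return (p_bar - p0) / sqrt(p0 * (1 - p0) / n)


# Hypothetical mean test: xbar = 52, H0 value 50, sigma = 10, n = 100.
z_mean = mean_test_statistic(52, 50, 10, 100)

# Hypothetical proportion test: p_bar = 0.35, H0 value 0.30, n = 400.
z_prop = proportion_test_statistic(0.35, 0.30, 400)

print(f"z (mean) = {z_mean:.3f}, z (proportion) = {z_prop:.3f}")
```

Note that the proportion statistic uses the hypothesized value \((p)_{H_0}\) in the standard error, whereas the confidence interval uses the sample proportion \(\overline{p}\text{.}\)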
Subsection p-value
Definition 7.6.
A p-value is the probability, assuming that \(H_0\) is true, of obtaining a random sample of size \(n\) that results in a test statistic at least as extreme as the one observed in the current sample.
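For a \(z\) test statistic, this definition can be turned into a short computation with the standard normal distribution. The sketch below is an illustration under that assumption: the function name and the `tail` labels are hypothetical, and a \(t\) statistic would need the \(t\) distribution with \(n-1\) degrees of freedom instead, which is not covered here.

```python
from statistics import NormalDist


def z_p_value(z, tail="two-sided"):
    """p-value for a z statistic; `tail` matches the direction of H1."""
    cdf = NormalDist().cdf  # standard normal cumulative distribution
    if tail == "left":       # H1: parameter < H0 value
        return cdf(z)
    if tail == "right":      # H1: parameter > H0 value
        return 1 - cdf(z)
    if tail == "two-sided":  # H1: parameter != H0 value
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two-sided'")


# Hypothetical test statistic z = 1.96.
print(f"two-sided p-value: {z_p_value(1.96):.4f}")
print(f"right-tailed p-value: {z_p_value(1.96, 'right'):.4f}")
```

A small p-value means the observed sample would be unusual if \(H_0\) were true. The usual decision rule rejects \(H_0\) when the p-value falls below the chosen significance level \(\alpha\text{.}\)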
Example 7.7.
We want to test the hypothesis that the proportion of households in the US in which someone owns a business or a farm is \(10.4\%\text{,}\) using the US Census sample data from October 2023:
https://bit.ly/3SskrqA
Determine the null and alternative hypotheses and find the test statistic.
Then find the p-value and decide what you can conclude based on this sample.