
Section 10.2 The Nuts and Bolts of Hypothesis Testing

We will learn two different procedures for completing a hypothesis test, but before we describe each, let’s do an example of how one of these procedures would work.

Exercise 10.2.1.

Let’s say a group on campus is claiming that the average number of hours worked per week by students on campus is \(13.5\text{.}\)
We want to see if this claim is reasonable, so we are going to go out and survey students and find the average number of hours worked per week for students in our sample.

(a)

We survey \(200\) students, and the sample average is \(15\) hours per week. If the population mean really is \(\mu=13.5\text{,}\) what is the probability of getting a sample mean this extreme?

(b)

We survey \(15\) students, and the sample average is \(14\) hours per week. If the population mean really is \(\mu=13.5\text{,}\) what is the probability of getting a sample mean this extreme?
(Let’s use the link below to think about these questions.)

Definition 10.2.2.

  • The rejection region is the range of values for a test statistic that cause us to reject \(H_0\text{.}\)
  • A critical value is a boundary that separates the rejection region from the values for which we fail to reject \(H_0\text{.}\)
  • A p-value is the probability of observing a sample statistic at least as extreme as the one actually observed, assuming \(H_0\) is true.
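
As a rough illustration of these definitions, the sketch below computes a critical value and a few p-values for a standard normal test statistic using Python's standard-library `statistics.NormalDist`. The function name `p_value`, the `tail` labels, and the sample numbers are assumptions chosen for illustration; they do not come from the text.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def p_value(z, tail):
    """Return the p-value of a z test statistic.

    tail is "left", "right", or "two" (hypothetical labels for this sketch).
    """
    if tail == "left":
        return STD_NORMAL.cdf(z)               # area to the left of z
    if tail == "right":
        return 1 - STD_NORMAL.cdf(z)           # area to the right of z
    if tail == "two":
        return 2 * (1 - STD_NORMAL.cdf(abs(z)))  # area in both tails
    raise ValueError("tail must be 'left', 'right', or 'two'")


# Critical value for a left-tailed test at alpha = 0.05:
# the boundary with 5% of the area to its left.
critical_value = STD_NORMAL.inv_cdf(0.05)

print(round(critical_value, 3))         # about -1.645
print(round(p_value(-1.5, "left"), 4))  # about 0.0668
print(round(p_value(-1.5, "two"), 4))   # about 0.1336
```

Here `inv_cdf` plays the role of Excel's `NORM.S.INV`, and `cdf` plays the role of `NORM.S.DIST(z,1)`.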

Exercise 10.2.3.

Exercise 10.2.4.

Choose the correct p-value for each situation. (I recommend drawing a picture of the rejection region to help with this.)

(a)

    A left-tailed test with test statistic \(z_{\overline{x}}=-2.75\)
  • \(\approx 0.0030\)
  • \(\approx 0.0287\)
  • \(\approx 0.0658\)
  • \(\approx 0.0124\)

(b)

    A right-tailed test with test statistic \(z_{\overline{x}}=1.90\)
  • \(\approx 0.0030\)
  • \(\approx 0.0287\)
  • \(\approx 0.0658\)
  • \(\approx 0.0124\)

(c)

    A two-tailed test with test statistic \(z_{\overline{x}}=-1.84\)
  • \(\approx 0.0030\)
  • \(\approx 0.0287\)
  • \(\approx 0.0658\)
  • \(\approx 0.0124\)

(d)

    A two-tailed test with test statistic \(z_{\overline{x}}=2.50\)
  • \(\approx 0.0030\)
  • \(\approx 0.0287\)
  • \(\approx 0.0658\)
  • \(\approx 0.0124\)

Subsection 10.2.1 The Logic of Hypothesis Testing

The hypothesis test begins with the assumption that the null hypothesis, \(H_0\text{,}\) is true. The goal of the process is to determine if there is enough evidence provided by the sample to infer that the alternative hypothesis, \(H_1\text{,}\) might be true instead.
The null hypothesis can never be accepted. The most we can say is that we do not have enough evidence to reject the null. The only two options are to:
  1. reject the null hypothesis;
  2. fail to reject the null hypothesis.
So how is the decision made to “reject” or “fail to reject” within each of our two hypothesis testing methods?
  • Traditional (Critical Value) Method:
    In the traditional method, the decision is made by comparing the test statistic to the critical value(s). When the test statistic falls in the rejection region, the decision is “reject the null hypothesis”. When the test statistic does not fall in the rejection region, the decision is “fail to reject the null hypothesis”.
  • p-value Method:
    In the p-value method, the decision is made by comparing the p-value to the significance level. When the p-value is smaller than the significance level, the decision is “reject the null hypothesis”. When the p-value is greater than or equal to the significance level, the decision is “fail to reject the null hypothesis”.
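
The two decision rules above can be sketched as a pair of small Python functions. The function names, the `tail` labels, and the sample inputs are hypothetical. Treating a test statistic that lands exactly on the critical value as "fail to reject" is also an assumption of this sketch, since the text does not address that boundary case.

```python
def decide_critical_value(z, critical_value, tail):
    """Traditional method: reject H0 when z lands in the rejection region."""
    if tail == "left":
        in_rejection_region = z < critical_value
    elif tail == "right":
        in_rejection_region = z > critical_value
    elif tail == "two":
        # Two-tailed: the rejection region lies beyond +/- the critical value.
        in_rejection_region = abs(z) > abs(critical_value)
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return "reject H0" if in_rejection_region else "fail to reject H0"


def decide_p_value(p_value, alpha):
    """p-value method: reject H0 only when the p-value is smaller than alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


print(decide_critical_value(-2.0, -1.645, "left"))  # reject H0
print(decide_p_value(0.03, 0.05))                   # reject H0
print(decide_p_value(0.05, 0.05))                   # fail to reject H0
```

Neither function ever returns "accept H0", which mirrors the point that the null hypothesis can only be rejected or not rejected.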

Subsection 10.2.2 Two Approaches to Hypothesis Testing

Now we are finally ready to describe in depth two hypothesis testing procedures and work through some examples. First, let’s summarize the steps for the traditional method (aka the critical value method).

Definition 10.2.5.

The Traditional (Critical Value) Method of Hypothesis Testing:
  1. Identify the two hypotheses using appropriate notation.
  2. Draw the appropriate curve, identify the significance level, and label critical value(s).
  3. Calculate the appropriate test statistic.
  4. Compare the critical value(s) to the test statistic and make the decision.
  5. State the conclusion.
Let’s work through an example using the traditional method. In this example, the appropriate test statistic formula is:
\begin{equation*} z_{\overline{x}}=\frac{\overline{x}-\mu}{\sigma/\sqrt{n}} \end{equation*}

Exercise 10.2.6.

(Donnelly 9.7)
A pizza place recently hired additional drivers and as a result now claims that its average delivery time for orders is under 46 minutes. A sample of 41 customer deliveries was examined, and the average delivery time was found to be 41.5 minutes. Historically, the standard deviation for delivery time is 11.8 minutes. Assuming that \(\alpha = 0.01\text{,}\) does this sample provide enough evidence to support the delivery time claim made by the pizza place?
(a)
Step 1: Identify the two hypotheses using appropriate notation.
Answer.
  • \(H_0\text{:}\) \(\mu\geq 46\)
  • \(H_1\text{:}\) \(\mu\lt 46\) (\(\leftarrow \) left-tail test)
(b)
Step 2: Draw the appropriate curve, identify the significance level, and label critical value(s).
Answer.
Figure 10.2.7.
\(-z_{.01}=NORM.S.INV(.01)\approx -2.33\)
(c)
Step 3: Calculate the appropriate test statistic.
Answer.
\begin{equation*} z_{\overline{x}}=\frac{41.5-46}{11.8/\sqrt{41}}\approx -2.44 \end{equation*}
(d)
Step 4: Compare the critical value(s) to the test statistic and make the decision.
Answer.
The test statistic \(-2.44\) is to the left of the critical value \(-2.33\text{,}\) so it lands in the rejection region, and we reject \(H_0\text{.}\)
(e)
Step 5: State the conclusion.
Answer.
There is enough evidence to support the claim that the average delivery time for orders is under 46 minutes.
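
As a quick check of the arithmetic in this exercise, here is a minimal Python sketch of the traditional method for a left-tailed test. The variable names are assumptions made for illustration, and `NormalDist().inv_cdf` stands in for Excel's `NORM.S.INV`.

```python
from math import sqrt
from statistics import NormalDist

# Values from the pizza delivery exercise.
x_bar = 41.5    # sample mean delivery time (minutes)
mu_0 = 46       # boundary value of mu under H0
sigma = 11.8    # historical standard deviation (minutes)
n = 41          # sample size
alpha = 0.01    # significance level

# Test statistic: z = (x_bar - mu) / (sigma / sqrt(n))
z = (x_bar - mu_0) / (sigma / sqrt(n))

# Left-tailed test: the critical value has area alpha to its left.
critical_value = NormalDist().inv_cdf(alpha)

# Reject H0 when the test statistic is left of the critical value.
reject = z < critical_value

print(round(z, 2))               # -2.44
print(round(critical_value, 2))  # -2.33
print(reject)                    # True
```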

Exercise 10.2.8.

(Donnelly 9.10)
A grocery store claims that customers spend an average of 5 minutes waiting for service at the store’s deli counter. A random sample of 40 customers was timed at the deli counter, and the average service time was found to be 5.5 minutes. Assume the standard deviation is 1.7 minutes per customer. Assuming that \(\alpha = 0.05\text{,}\) does this sample provide enough evidence to counter the claim made by the store’s management?
(a)
Step 1: Identify the two hypotheses using appropriate notation.
Answer.
  • \(H_0\text{:}\) \(\mu=5\)
  • \(H_1\text{:}\) \(\mu\neq 5\) (\(\leftarrow \) two-tail test)
(b)
Step 2: Draw the appropriate curve, identify the significance level, and label critical value(s).
Answer.
Figure 10.2.9.
\(-z_{.025}=NORM.S.INV(.025)\approx -1.96\)
By symmetry, \(z_{.025}\approx 1.96\)
(c)
Step 3: Calculate the appropriate test statistic.
Answer.
\begin{equation*} z_{\overline{x}}=\frac{5.5-5}{1.7/\sqrt{40}}\approx 1.86 \end{equation*}
(d)
Step 4: Compare the critical value(s) to the test statistic and make the decision.
Answer.
The test statistic \(1.86\) is between the two critical values \(\pm 1.96\text{,}\) so it does not land in the rejection region, and we fail to reject \(H_0\text{.}\)
(e)
Step 5: State the conclusion.
Answer.
The sample does not provide enough evidence to counter the claim that customers spend an average of 5 minutes waiting for service.
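
The same check for a two-tailed test can be sketched in Python as follows. The variable names are again assumptions. The significance level is split evenly between the two tails, so each critical value cuts off an area of \(\alpha/2\).

```python
from math import sqrt
from statistics import NormalDist

# Values from the deli counter exercise.
x_bar = 5.5     # sample mean service time (minutes)
mu_0 = 5        # claimed population mean under H0
sigma = 1.7     # assumed standard deviation (minutes)
n = 40          # sample size
alpha = 0.05    # significance level

z = (x_bar - mu_0) / (sigma / sqrt(n))

# Two-tailed test: put alpha/2 in each tail and use symmetry.
lower = NormalDist().inv_cdf(alpha / 2)
upper = -lower

# Reject H0 only when z falls outside the interval [lower, upper].
reject = z < lower or z > upper

print(round(z, 2))                       # 1.86
print(round(lower, 2), round(upper, 2))  # -1.96 1.96
print(reject)                            # False
```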

Definition 10.2.10.

The p-value Method of Hypothesis Testing:
  1. Identify the two hypotheses using appropriate notation.
  2. Draw the appropriate curve and identify the significance level.
  3. Calculate the appropriate test statistic and the associated p-value.
  4. Compare the p-value and the significance level and make the decision.
  5. State the conclusion.

Exercise 10.2.11.

(Donnelly 9.8)
A sporting goods store believes the average age of its customers is 38 or less. A random sample of 40 customers was surveyed, and the average customer age was found to be 41.2 years. Assume the standard deviation for customer age is 9.0 years. Assuming that \(\alpha = 0.01\text{,}\) does the sample provide enough evidence to refute the age claim made by the sporting goods store?
(a)
Step 1: Identify the two hypotheses using appropriate notation.
Answer.
  • \(H_0\text{:}\) \(\mu\leq 38\)
  • \(H_1\text{:}\) \(\mu\gt 38\) (\(\leftarrow \) right-tail test)
(b)
Step 2: Draw the appropriate curve and identify the significance level.
Answer.
Figure 10.2.12.
(c)
Step 3: Calculate the appropriate test statistic and the associated p-value.
Answer.
\begin{equation*} z_{\overline{x}}=\frac{41.2-38}{9/\sqrt{40}}\approx 2.25 \end{equation*}
\begin{equation*} \text{p-value}=1-NORM.S.DIST(2.25,1)\approx 0.0122 \end{equation*}
(d)
Step 4: Compare the p-value and the significance level and make the decision.
Answer.
The p-value \(\approx 0.0122\gt 0.01=\alpha\text{.}\)
Therefore we fail to reject \(H_0\text{.}\)
(e)
Step 5: State the conclusion.
Answer.
The sample does not provide enough evidence to refute the claim that the average age of customers is less than or equal to 38.
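
The p-value method for this right-tailed test can be sketched in Python as below. The variable names are assumptions. The test statistic is rounded to two decimal places before the p-value is computed, to mirror the hand calculation above. Using the unrounded statistic changes the p-value slightly in the fourth decimal place but not the decision.

```python
from math import sqrt
from statistics import NormalDist

# Values from the sporting goods exercise.
x_bar = 41.2    # sample mean customer age (years)
mu_0 = 38       # boundary value of mu under H0
sigma = 9.0     # assumed standard deviation (years)
n = 40          # sample size
alpha = 0.01    # significance level

# Test statistic, rounded to two decimals as in the worked answer.
z = round((x_bar - mu_0) / (sigma / sqrt(n)), 2)

# Right-tailed test: the p-value is the area to the right of z.
p_value = 1 - NormalDist().cdf(z)

# Reject H0 only when the p-value is smaller than alpha.
reject = p_value < alpha

print(z)                  # 2.25
print(round(p_value, 4))  # 0.0122
print(reject)             # False
```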