Section 2.1 Quantitative Data Displays

Definition 2.1.1.

  • frequency distribution: a table that shows the number of data observations that fall into specific intervals or categories
  • class: a category in a frequency distribution
  • relative frequency distribution: displays the proportion of observations of each class relative to the total number of observations
  • cumulative relative frequency distribution: totals the proportion of observations that are less than or equal to the class at which you are looking
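As a rough illustration of these definitions, the Python sketch below builds a frequency distribution, a relative frequency distribution, and a running (cumulative) total of the relative frequencies. The data values and variable names are made up for illustration and do not come from any file used in this section.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical data: number of items sold on each of 10 days (not from the course files).
observations = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]

# Frequency distribution: how many observations fall in each class.
frequency = dict(sorted(Counter(observations).items()))

# Relative frequency: each class count divided by the total number of observations.
total = len(observations)
relative = {cls: count / total for cls, count in frequency.items()}

# Cumulative relative frequency: running total of the relative frequencies.
cumulative = dict(zip(relative, accumulate(relative.values())))

for cls in frequency:
    print(cls, frequency[cls], round(relative[cls], 2), round(cumulative[cls], 2))
```

In this made-up data set, the running total for class 3 is 0.6, so 60% of the days had 3 or fewer items sold.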

Exercise 2.1.2.

Definition 2.1.3.

  • histogram: a graph showing the number of observations in each class of a frequency distribution
  • ogive: a line graph that plots the cumulative relative frequency distribution
Let’s look at both tabs at the link below to see examples of these types of graphs and charts:
UCCS Student Age Example: bit.ly/3XxmsVN

Exploration 2.1.1.

Let’s look at the file below, which has the number of iPads sold at a particular store over a certain number of days. We’re going to make a frequency distribution for this data.
File: external/sheets/iPadSales.xlsx
(The manager might want to know, for example, the percentage of days that 3 or fewer iPads were sold.)
  • Highlight D2:D7
    (Note: the “Bins” are the classes in our distribution.)
  • type =FREQUENCY(A2:A51,C2:C7)
  • Hit \(\text{Control}+\text{Shift}+\text{Enter}\) together
Here is the frequency distribution we should get when we do this:
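To see the counting rule that FREQUENCY applies, we can sketch it in Python. This is only an approximation of Excel's behavior under the assumption that the bins are listed in ascending order: each value is counted in the first bin whose upper bound is greater than or equal to it, and one extra slot at the end catches anything above the last bin. The sales figures and bin values below are hypothetical, not the contents of iPadSales.xlsx.

```python
from bisect import bisect_left

def frequency(data, bins):
    """Count values per bin in the style of Excel's FREQUENCY.

    Assumes `bins` is sorted in ascending order. A value x is counted in the
    first bin whose upper bound is >= x; the final extra slot counts values
    larger than the last bin.
    """
    counts = [0] * (len(bins) + 1)
    for x in data:
        counts[bisect_left(bins, x)] += 1
    return counts

# Hypothetical daily sales and bins (upper bounds of each class).
sales = [0, 1, 1, 2, 3, 3, 3, 4, 5, 5]
bins = [0, 1, 2, 3, 4, 5]

print(frequency(sales, bins))
```

The returned list has one more entry than there are bins; when no value exceeds the last bin, that final entry is 0.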

Subsection 2.1.1 Creating Histograms in Excel

Next we want to create histograms in Excel, but before doing that, we’re going to load the Data Analysis tool (the Analysis ToolPak add-in) that we will use to create them. (Later in the course we will use this tool for many more things!)

Note 2.1.4.

Go to the link below to see instructions for loading the Data Analysis tool in Excel:
Analysis ToolPak in Excel: support.microsoft.com/en-us/office/load-the-analysis-toolpak-in-excel-6a63e598-cd6d-42e3-9317-6b40ba1a66b4#OfficeVersion=Windows
(Note that there are different tabs with instructions for Windows and macOS.)

Exploration 2.1.2.

Let’s go back to the iPad sales example from Exploration 2.1.1.
We’re going to create a histogram for this data in two different ways. First we’re going to create a histogram using the Data Analysis tool that we just loaded.
  • In the Excel file, go to the Data tab, and then, on the right side of the ribbon, click on “Data Analysis”. (If “Data Analysis” is not showing up, then the add-in has not been loaded yet.)
  • Choose the “Histogram” option.
  • Choose the data as the “Input range”, specify the bins, choose “Chart Output”, and specify the output range.
Here is the histogram and frequency distribution that we get by following these instructions:
Does the frequency distribution that we got using the Data Analysis tool match the frequency distribution we got in Exploration 2.1.1?

Exploration 2.1.3.

Now let’s create a histogram using Excel Charts instead of the Data Analysis tool.
Go back to the iPad sales example from Exploration 2.1.1.
  • Select the data (A1:A51) and then go to “Insert” and choose the Histogram option:
  • Here is what we should get when we do this:
Does the chart we got here look as good as the one we got using the Data Analysis add-in?
There are some tweaks that we will make to the chart we got here to make it look better, but the chart we created using the Data Analysis tool looked much better without any extra tweaking.
(There are more details in the textbook about the steps we’ll work through in class.)

Activity 2.1.4.

(Optional)
Figure 2.1.5. Frequency and Relative Frequency Distributions (Made in GeoGebra by David Gurney)

Subsection 2.1.2 Grouped Quantitative Data

Definition 2.1.6.

\(2^k\geq n\) Rule:
If we want to decide how many classes to use in a frequency distribution or histogram, we can let
\(k\) \(=\) number of classes
\(n\) \(=\) number of data points
The goal is to find the lowest value of \(k\) that satisfies the inequality \(2^k\geq n\text{.}\)
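The search for the lowest \(k\) can be sketched in a few lines of Python. The function name `number_of_classes` is just an illustrative choice, and the sample sizes are arbitrary.

```python
def number_of_classes(n):
    """Return the smallest whole number k with 2**k >= n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    k = 0
    while 2 ** k < n:
        k += 1
    return k

print(number_of_classes(100))   # 7, since 2**6 = 64 < 100 <= 128 = 2**7
print(number_of_classes(1000))  # 10, since 2**9 = 512 < 1000 <= 1024 = 2**10
```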

Example 2.1.7.

Let’s look at the Excel file below and find the number of classes, \(k\text{,}\) that we should use to create a frequency distribution.

Answer.

\(n=50\text{,}\) so we want \(k\) so that \(2^k\geq 50\text{.}\)
  • \(2^5=32\lt 50\text{,}\) so \(5\) is too small
  • \(2^6=64\geq 50\text{,}\) so \(k=6\) is a good choice.
Once the number of classes, \(k\text{,}\) is decided, we must determine the width of each class. The width is the range of the numbers that are put into each class. The following formula calculates a good width:
\begin{equation*} \boxed{\text{Estimated class width: } \frac{\text{Maximum}-\text{Minimum}}{k}} \end{equation*}

Exercise 2.1.8.

Let’s look back at the file from Example 2.1.7.
We found that \(k=6\) is a good choice for the number of classes, so let’s use that to find a good width for each class.
Answer.
Estimated class width:
\begin{equation*} \frac{17.4-0.6}{6}=\frac{16.8}{6}=2.8 \end{equation*}
Let’s round up to a useful whole number that makes the distribution more readable:
\begin{equation*} \text{Class width}=3 \end{equation*}
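The same arithmetic can be checked with a short Python sketch. The function name is hypothetical, and rounding up with `math.ceil` is only one simple stand-in for "round up to a useful whole number"; in practice we might pick a different convenient value.

```python
import math

def estimated_class_width(minimum, maximum, k):
    """Estimated class width: (maximum - minimum) / k."""
    return (maximum - minimum) / k

# Minimum, maximum, and k taken from the worked answer above.
width = estimated_class_width(0.6, 17.4, 6)

print(round(width, 2))   # 2.8
print(math.ceil(width))  # 3
```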

Exercise 2.1.9.

Let’s look back at the file from Example 2.1.7.
We found that a good class width would be 3, so let’s make a frequency distribution with 6 classes, each with a width of 3.
(I recommend first sorting the data to make your distribution.)
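If we want to check a hand count, the tallying can be sketched in Python. This sketch assumes the first class starts at 0 and that each class runs from its lower limit up to (but not including) the next lower limit. The hold times listed are made up for illustration and are not the contents of the Dell Hold Times file.

```python
# Hypothetical hold times in minutes (not the contents of the Dell Hold Times file).
hold_times = [0.6, 2.9, 3.0, 4.7, 5.9, 6.2, 8.8, 9.1, 11.5, 12.0, 14.9, 17.4]

num_classes = 6   # k from the 2^k >= n rule
class_width = 3   # rounded-up class width
start = 0         # assumed lower limit of the first class

counts = [0] * num_classes
for t in sorted(hold_times):
    index = int((t - start) // class_width)  # which class the value falls in
    counts[index] += 1

for i, count in enumerate(counts):
    lower = start + i * class_width
    upper = lower + class_width
    print(f"{lower} to under {upper}: {count}")
```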

Exercise 2.1.10.

In Exercise 2.1.9 you created a frequency distribution for the data in the Dell Hold Times file.
Let’s use Excel to create a frequency distribution without having to count the number of items in each class ourselves.
In the file below, we’re going to create a “Bins” column that has an upper bound for each class, and then we’re going to use the bins to create the frequency distribution.
File: external/sheets/DellHoldTimesFrequencyBins.xlsx
  • In the “Bins” column, enter an upper bound for each class. (For example, in E9, we could enter \(2.9\text{.}\))
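  • Our bins are in column E in the file above. Now highlight F9:F14 and type =FREQUENCY(A2:A51, E9:E14). Then hit \(\text{Control}+\text{Shift}+\text{Enter}\) together, as in Exploration 2.1.1. (In versions of Excel that support dynamic arrays, Enter alone also works.)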
Answer.
We should get the frequency distribution below when doing this.

Exercise 2.1.11.

(Donnelly 2.9)
The table in the Excel file below lists the receipt total for 350 randomly selected customers for the home improvement store Lowe’s.
  1. Using Excel and the \(2^k\geq n\) rule, construct a frequency distribution for the data.
  2. Using the \(k\) found above, calculate the relative frequencies for each class.
  3. Using the results from above, calculate the cumulative relative frequencies for each class.
  4. Construct a histogram.
Answer.

Subsection 2.1.3 Shapes of Histograms

Definition 2.1.12.

  • A distribution of data is symmetric if the right and left sides of the distribution are mirror images
  • A distribution of data is skewed if a large number of data items are piled up at one end or the other, with a “tail” at the opposite end.
    • skewed to the right: the “tail” is to the right
    • skewed to the left: the “tail” is to the left
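One quick way to see a “tail” without any charting tool is to print a rough text histogram. The Python sketch below uses made-up values that pile up at the low end, so the printed bars trail off toward the larger values, which is the pattern described as skewed to the right.

```python
from collections import Counter

# Hypothetical data with most values piled up at the low end.
values = [1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 5, 7]

counts = Counter(values)

# One row per value from the minimum to the maximum; a Counter reports 0
# for values that never occur, so gaps in the tail show up as empty rows.
for v in range(min(values), max(values) + 1):
    print(f"{v}: {'*' * counts[v]}")
```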

Exercise 2.1.13.

Figure 2.1.14. Shapes of Distributions (Made in GeoGebra by Samantha Garcia and Lauren Nelsen)

Subsection 2.1.4 Scatter Plots

Definition 2.1.15.

Scatter plots provide a visual of the relationship between two quantitative variables -- the independent and dependent variables.
  • The dependent variable is placed on the vertical axis of the scatter plot and is influenced by changes in the independent variable, which is placed on the horizontal axis.

Exercise 2.1.16.

Let’s look at the example below with home prices and square footage:
Figure 2.1.17. Scatter Plot powered by Desmos
What is the independent variable and what is the dependent variable?
Answer.
The independent variable is the square footage, and the dependent variable is home price. (The home price depends on the square footage.)
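A scatter plot like this one can also be sketched in Python. The numbers below are hypothetical, not the data behind the figure, and the helper names `scatter_points` and `plot_scatter` are illustrative. Drawing the chart assumes the third-party matplotlib package is installed; pairing the values does not.

```python
# Hypothetical homes: square footage (independent) and price in dollars (dependent).
square_feet = [1200, 1500, 1800, 2100, 2400]
prices = [210_000, 255_000, 300_000, 340_000, 395_000]

def scatter_points(x_values, y_values):
    """Pair each independent value with its dependent value as (x, y)."""
    if len(x_values) != len(y_values):
        raise ValueError("both variables need the same number of observations")
    return list(zip(x_values, y_values))

def plot_scatter(x_values, y_values):
    """Draw the scatter plot; requires the third-party matplotlib package."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(x_values, y_values)
    # Independent variable on the horizontal axis, dependent on the vertical axis.
    ax.set_xlabel("Square footage (independent variable)")
    ax.set_ylabel("Home price (dependent variable)")
    plt.show()

points = scatter_points(square_feet, prices)
print(points[0])  # (1200, 210000)
```

Calling `plot_scatter(square_feet, prices)` would then open the chart, with square footage on the horizontal axis and price on the vertical axis.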

Exercise 2.1.18.

(Donnelly 2.58)
A marketing research firm would like to display the relationship between a family’s monthly food costs and the number of family members living in a household. The data in the Excel file below contain the monthly food costs and the number of family members for 16 families. Construct a scatter plot and describe the relationship between the number of household members and the family’s monthly food costs.
Answer.

Definition 2.1.19.

A line chart is a special type of scatter plot in which the data points are connected with a line.
Below is an example of a line chart. (Check the box to see the line chart.)
Figure 2.1.20. Line Chart Examples (Made in GeoGebra by David Gurney)