
Section 13.4 Diamonds Modeling

(Background on diamonds was provided in part by research done at https://www.brilliance.com/education/diamonds/depth-table)
When a gemologist determines the value of a diamond, he or she considers a number of different factors. These factors are known as the 4 Cs (carat, color, clarity, and cut). How large is the diamond? The size (really the weight) of the diamond is measured in carats. Is it colorless or does it have a slight hue? Are there any visible inclusions in the diamond (this affects the clarity)?
Was the stone cut well? This is described by both the table and the depth. Both help to define the physical shape of a diamond and contribute to its sparkle. When these two features are proportioned just right, a diamond of any size looks spectacular.
Every diamond has a flat, square-shaped facet on its top called the table. It plays a critical role in a diamond’s appearance. The table refracts rays of light as they pass, directing them to the facets that make the diamond look so sparkly. The physical size of the table facet naturally varies depending on the overall size of the diamond. Jewelers measure the table percentage when grading a diamond’s cut. Table percentage is calculated by dividing the width of the table by the overall width of the diamond. The ideal table percentage will vary based on the shape of the diamond.
The depth of a diamond might also be called the “height”: it is the distance from the table to the culet (the pointed tip) of the diamond. As with a diamond’s table, jewelers grade a diamond’s depth based on its depth percentage. Depth percentage is the diamond’s depth divided by the overall width of the diamond. This percentage dictates the overall proportions of the diamond, which in turn directly impact how light reflects off the facets in the stone.
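The two percentages described above are simple ratios, and a short sketch may make them concrete. The function names and the measurements below are hypothetical (they are not taken from the Diamonds.xlsx file), and the sketch assumes all lengths are given in the same unit.

```python
def table_percentage(table_width: float, diamond_width: float) -> float:
    """Width of the table facet as a percentage of the overall width."""
    return 100.0 * table_width / diamond_width


def depth_percentage(depth: float, diamond_width: float) -> float:
    """Table-to-culet distance as a percentage of the overall width."""
    return 100.0 * depth / diamond_width


# Hypothetical measurements in millimeters, chosen only for illustration.
table_pct = table_percentage(table_width=3.0, diamond_width=5.0)
depth_pct = depth_percentage(depth=3.1, diamond_width=5.0)
print(f"table: {table_pct:.1f}%  depth: {depth_pct:.1f}%")
```

Because both ratios share the same denominator, the two percentages can be compared across diamonds of different sizes.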

Exercise 13.4.1.

Here is the file we are going to use for this problem: external/sheets/Diamonds.xlsx
The price of diamonds is not just determined by size, but by multiple characteristics. For simplicity, in this example we will start with size. Create the scatterplot for the two variables “Price” and “Carat”.
(The solutions to all parts of this problem are here: external/sheets/DiamondsSimpleLinearRegressionSolutions.xlsx)

(a)

Determine the least squares line. Interpret the slope coefficient.

(b)

According to your least squares line, what would you expect the price to be for a 5 carat diamond?

(c)

Compute the coefficient of correlation. What does it tell you about the relationship between the size of a diamond and the price?
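For readers who want to check their spreadsheet work outside of Excel, the quantities in parts (a) through (c) can be sketched in a few lines of Python. This is a minimal illustration and not the solution file: the function name `least_squares` and the carat and price values are invented for the example, so the numbers it prints say nothing about the real data.

```python
from math import sqrt


def least_squares(xs, ys):
    """Return (slope, intercept, r) for the least squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                   # change in predicted y per one-unit change in x
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)           # coefficient of correlation
    return slope, intercept, r


# Invented toy data (NOT from Diamonds.xlsx): carat weights and prices.
carats = [0.3, 0.5, 0.7, 1.0, 1.2, 1.5]
prices = [500, 1400, 2600, 5200, 6900, 10500]

slope, intercept, r = least_squares(carats, prices)
predicted_price = intercept + slope * 5.0   # part (b): plug carat = 5 into the line
print(f"slope={slope:.2f}, intercept={intercept:.2f}, r={r:.3f}")
print(f"predicted price at 5 carats: {predicted_price:.2f}")
```

The slope is read as the expected change in price for each additional carat. If 5 carats lies outside the range of carat values in the data, the prediction in part (b) is an extrapolation and should be treated with caution.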

Exercise 13.4.2.

We’re going to use the same file we used in the previous problem: external/sheets/Diamonds.xlsx
Now we will take into consideration the other characteristics of diamonds that determine price in addition to carat: color, clarity, and cut. Let’s redo the regression and create a model that does a better job than the one in the previous exercise, which included only a single predictor variable.
(The solutions to all parts of this problem are here: external/sheets/DiamondsMultipleRegressionSolutions.xlsx)

(a)

Create the regression model with all 5 variables that are provided: color, depth (in percentage), clarity, table (in percentage), and carat.

(b)

Interpret each of the coefficients in the model.

(c)

Do all 5 of the predictor variables belong in the model? Why or why not?

(d)

Is the overall model valid? Why or why not?

(e)

What is the coefficient of determination? What does it tell you about the model?

(f)

Remove variables one by one and reassess the model. What is the best model to predict the price of a diamond? Why?
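The model comparison in parts (e) and (f) can also be sketched in Python. The sketch below fits a multiple regression by solving the normal equations and reports the coefficient of determination together with the adjusted version, which penalizes extra predictors. Several things here are assumptions: the helper names are hypothetical, the toy rows are invented rather than taken from Diamonds.xlsx, every predictor is assumed to be numeric already (how color and clarity are coded in the file is not shown here), and comparing adjusted R² is a stand-in for the p-value checks that a spreadsheet regression output would normally support.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix so the inputs are left unmodified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_multiple(rows, ys):
    """Return (coefficients with intercept first, R^2, adjusted R^2)."""
    n, k = len(rows), len(rows[0])
    design = [[1.0] + list(r) for r in rows]   # leading 1.0 is the intercept column
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k + 1)]
           for i in range(k + 1)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k + 1)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, d)) for d in design]
    mean_y = sum(ys) / n
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    sst = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1 - sse / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return beta, r2, adj_r2


# Invented toy rows (NOT from Diamonds.xlsx): [carat, depth %, table %].
rows = [[0.3, 61.5, 57], [0.5, 62.3, 58], [0.7, 60.8, 56],
        [1.0, 62.0, 59], [1.2, 61.1, 57], [1.5, 63.0, 60]]
prices = [500, 1400, 2600, 5200, 6900, 10500]

_, r2_full, adj_full = fit_multiple(rows, prices)
_, r2_carat, adj_carat = fit_multiple([[r[0]] for r in rows], prices)
print(f"all three predictors: R^2={r2_full:.3f}, adjusted R^2={adj_full:.3f}")
print(f"carat only:           R^2={r2_carat:.3f}, adjusted R^2={adj_carat:.3f}")
```

Plain R² never decreases when a predictor is added, which is why the adjusted value is the more useful one for comparing models of different sizes. A predictor whose removal leaves adjusted R² unchanged or higher is a candidate for dropping.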