13.5 Correlation Coefficient r

The correlation coefficient [latex]r[/latex] is calculated by

[latex]r = \frac{S_{xy}}{\sqrt{S_{xx} \times S_{yy}}}[/latex]

where

[latex]S_{xy} = \sum x_i y_i - \frac{\left( \sum x_i \right) \left( \sum y_i \right)}{n}, \quad S_{xx} = \sum x_i^2 - \frac{\left( \sum x_i \right)^2}{n}, \quad S_{yy} = \sum y_i^2 - \frac{\left( \sum y_i \right)^2}{n},[/latex]

and [latex]n[/latex] is the number of observed pairs [latex](x_i, y_i)[/latex].
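As a minimal sketch of the formula above, the following Python function computes [latex]r[/latex] from the three sums. The function name `correlation` and the five data pairs are hypothetical, chosen only to keep the arithmetic small; they do not come from the text.

```python
import math

def correlation(xs, ys):
    """Compute r = S_xy / sqrt(S_xx * S_yy) from paired data."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data (not from the text).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Here S_xy = 6, S_xx = 10, S_yy = 6, so r = 6 / sqrt(60).
r = correlation(x, y)
print(round(r, 4))
```

For these made-up data the function returns approximately 0.7746, a moderately strong positive linear association.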

The correlation coefficient [latex]r[/latex] measures the association between the response variable [latex]y[/latex] and the predictor variable [latex]x[/latex] in the following three aspects:

  • Pattern: The correlation coefficient [latex]r[/latex] measures linear association. Do NOT use the correlation coefficient [latex]r[/latex] to describe non-linear association.
  • Strength: The closer [latex]r[/latex] is to either +1 or -1, the stronger the linear association. When [latex]r = \pm 1[/latex], [latex]y[/latex] and [latex]x[/latex] have a perfect linear association. That is, all the data points in the scatter plot of [latex]y[/latex] versus [latex]x[/latex] fall on a straight line.
  • Direction: Positive or negative. Positive association [latex](r > 0)[/latex] means that [latex]y[/latex] and [latex]x[/latex] tend to change in the same direction. That is, [latex]y[/latex] tends to increase (decrease) when [latex]x[/latex] increases (decreases). Negative association [latex](r < 0)[/latex] means that [latex]y[/latex] and [latex]x[/latex] tend to change in opposite directions. That is, [latex]y[/latex] tends to increase (decrease) when [latex]x[/latex] decreases (increases).
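The three aspects can be illustrated with a small Python sketch. The helper `correlation` and all three data sets are hypothetical, constructed for illustration only: an exact increasing line, an exact decreasing line, and a U-shaped curve.

```python
import math

def correlation(xs, ys):
    """Compute r = S_xy / sqrt(S_xx * S_yy) from paired data."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    return s_xy / math.sqrt(s_xx * s_yy)

# Direction and strength: points exactly on an increasing line, y = 2x + 1.
x_line = [1, 2, 3, 4, 5]
r_up = correlation(x_line, [2 * x + 1 for x in x_line])

# Points exactly on a decreasing line, y = 10 - 3x.
r_down = correlation(x_line, [10 - 3 * x for x in x_line])

# Pattern: a perfect U-shape, y = x^2, on x values symmetric about 0.
x_sym = [-2, -1, 0, 1, 2]
r_u = correlation(x_sym, [x ** 2 for x in x_sym])

print(r_up, r_down, r_u)
```

The two straight lines give [latex]r = 1[/latex] and [latex]r = -1[/latex]. The U-shaped data give [latex]r = 0[/latex] even though [latex]y[/latex] is completely determined by [latex]x[/latex], which is why [latex]r[/latex] should not be used to describe non-linear association.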

Exercise: Correlation Coefficient

Match the following correlation coefficients with the scatter plots.
(1) 0.989 (2) 0.697 (3) -0.887 (4) -0.020

(a) A scatter plot with a distinct U-shape.
(b) A scatter plot with an upward trend. The data fall in a wide band.
(c) A scatter plot with a downward trend. The data fall in a wide band.
(d) A scatter plot with an upward trend. The data fall in a very tight band.
Figure 13.7: Match Correlation Coefficients and Scatter Plots. [Image Description (See Appendix D Figure 13.7)]

Answers:

  1. Plot (a): [latex]r=-0.020[/latex]. The points follow a U-shaped curve, so there is no linear association between [latex]y[/latex] and [latex]x[/latex], and [latex]r[/latex] should be close to 0.
  2. Plot (b): [latex]r=0.697[/latex]. When [latex]x[/latex] increases, [latex]y[/latex] also tends to increase, so there is a positive linear association between [latex]y[/latex] and [latex]x[/latex], i.e., [latex]r > 0[/latex]. The association is only moderately strong, since the points fall in a wide band rather than close to a straight line.
  3. Plot (c): [latex]r=-0.887[/latex]. When [latex]x[/latex] increases, [latex]y[/latex] tends to decrease, so there is a negative linear association between [latex]y[/latex] and [latex]x[/latex], i.e., [latex]r < 0[/latex]. The association is fairly strong, since the points roughly follow a straight line.
  4. Plot (d): [latex]r=0.989[/latex]. When [latex]x[/latex] increases, [latex]y[/latex] also increases, so there is a positive linear association between [latex]y[/latex] and [latex]x[/latex], i.e., [latex]r > 0[/latex]. The association is extremely strong, since the points lie almost exactly on a straight line.

 

Exercise: Concepts on Correlation Coefficient

Explain whether the following statements are true or false. Correct them if they are false.

  1. If [latex]r \approx 0[/latex], there is no association between [latex]y[/latex] and [latex]x[/latex].
  2. The larger the value of [latex]r[/latex], the stronger the association between [latex]y[/latex] and [latex]x[/latex].
  1. False. If [latex]r \approx 0[/latex], there is no linear association between [latex]y[/latex] and [latex]x[/latex]. A non-linear association, such as a U-shaped pattern, may still be present.
  2. False. The larger the absolute value of [latex]r[/latex], the stronger the linear association. Equivalently, the closer [latex]r[/latex] is to +1 or -1, the stronger the linear association.
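The second statement can be checked with a short Python sketch that ranks the four correlation coefficients from the matching exercise by strength. The variable names are hypothetical; ranking by absolute value is simply one way to express "closer to +1 or -1".

```python
# Correlation coefficients from the matching exercise above.
r_values = [0.989, 0.697, -0.887, -0.020]

# Strength depends on |r|, so sort by absolute value, strongest first.
by_strength = sorted(r_values, key=abs, reverse=True)
print(by_strength)  # [0.989, -0.887, 0.697, -0.02]

# Sorting by the raw value instead would put 0.697 ahead of -0.887.
by_value = sorted(r_values, reverse=True)
print(by_value)     # [0.989, 0.697, -0.02, -0.887]
```

Although -0.887 is the smallest of the four values, it indicates a stronger linear association than 0.697 because its absolute value is larger.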

License


Introduction to Applied Statistics Copyright © 2024 by Wanhua Su is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.