13.5 Correlation Coefficient r
The correlation coefficient [latex]r[/latex] is calculated by
[latex]r = \frac{S_{xy}}{\sqrt{S_{xx} \times S_{yy}}}[/latex]
where
[latex]S_{xy} = \sum x_i y_i - \frac{\left( \sum x_i \right) \left( \sum y_i \right)}{n}, S_{xx} = \sum x_i^2 - \frac{\left( \sum x_i \right)^2}{n}, S_{yy} = \sum y_i^2 - \frac{\left( \sum y_i \right)^2}{n}.[/latex]
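As a minimal sketch of the calculation above, the following Python function computes [latex]r[/latex] directly from [latex]S_{xy}[/latex], [latex]S_{xx}[/latex], and [latex]S_{yy}[/latex]. The function name `correlation` and the five data points are made up for illustration; they do not come from the text.

```python
import math

def correlation(x, y):
    """Return r = S_xy / sqrt(S_xx * S_yy) for paired lists x and y."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    s_xy = sum(a * b for a, b in zip(x, y)) - sum_x * sum_y / n
    s_xx = sum(a * a for a in x) - sum_x ** 2 / n
    s_yy = sum(b * b for b in y) - sum_y ** 2 / n
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data, chosen only to keep the arithmetic easy to check by hand
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Here S_xy = 6, S_xx = 10, S_yy = 6, so r = 6 / sqrt(60)
r = correlation(x, y)
print(round(r, 3))  # 0.775
```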
The correlation coefficient [latex]r[/latex] measures the association between the response variable [latex]y[/latex] and the predictor variable [latex]x[/latex] in the following three aspects:
- Pattern: The correlation coefficient [latex]r[/latex] measures linear association. Do NOT use the correlation coefficient [latex]r[/latex] to describe non-linear association.
- Strength: The closer [latex]r[/latex] is to either +1 or -1, the stronger the linear association. When [latex]r = \pm 1[/latex], [latex]y[/latex] and [latex]x[/latex] have a perfect linear association. That is, all the data points in the scatter plot of [latex]y[/latex] versus [latex]x[/latex] fall on a straight line.
- Direction: Positive or negative. Positive association [latex](r > 0)[/latex] means that [latex]y[/latex] and [latex]x[/latex] tend to change in the same direction; that is, [latex]y[/latex] tends to increase (decrease) when [latex]x[/latex] increases (decreases). Negative association [latex](r < 0)[/latex] means that [latex]y[/latex] and [latex]x[/latex] tend to change in opposite directions; that is, [latex]y[/latex] tends to increase (decrease) when [latex]x[/latex] decreases (increases).
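The three aspects can be checked numerically. The sketch below uses made-up data: two exact straight lines (one rising, one falling) and one exact parabola. The helper `correlation` and all variable names are assumptions for illustration only.

```python
import math

def correlation(x, y):
    """Return r = S_xy / sqrt(S_xx * S_yy) for paired lists x and y."""
    n = len(x)
    s_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    s_xx = sum(a * a for a in x) - sum(x) ** 2 / n
    s_yy = sum(b * b for b in y) - sum(y) ** 2 / n
    return s_xy / math.sqrt(s_xx * s_yy)

x = [-2, -1, 0, 1, 2]

# Strength and direction: points exactly on a line give r = +1 or r = -1
r_up = correlation(x, [2 * v + 1 for v in x])     # y = 2x + 1, rising line
r_down = correlation(x, [-3 * v + 5 for v in x])  # y = -3x + 5, falling line

# Pattern: y = x^2 is a perfect but non-linear relationship, yet r = 0
r_curve = correlation(x, [v ** 2 for v in x])

print(r_up, r_down, r_curve)
```

The parabola shows why [latex]r[/latex] should not be used to describe non-linear association: [latex]y[/latex] is completely determined by [latex]x[/latex], yet [latex]r[/latex] is 0.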
Exercise: Correlation Coefficient
Match the following correlation coefficients with the scatter plots.
(1) 0.989 (2) 0.697 (3) -0.887 (4) -0.020
Figure 13.7: Match Correlation Coefficients and Scatter Plots, panels (a) to (d). [Image Description (See Appendix D Figure 13.7)]
Answers:
- [latex]r=-0.020[/latex]. There is no linear association between [latex]y[/latex] and [latex]x[/latex], so [latex]r[/latex] should be close to 0.
- [latex]r=0.697[/latex]. When [latex]x[/latex] increases, [latex]y[/latex] also tends to increase, so there is a positive linear association between [latex]y[/latex] and [latex]x[/latex], i.e., [latex]r > 0[/latex]. The association is not extremely strong, since the points follow a straight line only loosely.
- [latex]r=-0.887[/latex]. When [latex]x[/latex] increases, [latex]y[/latex] tends to decrease, so there is a negative linear association between [latex]y[/latex] and [latex]x[/latex], i.e., [latex]r < 0[/latex]. The association is quite strong, since the points roughly follow a straight line.
- [latex]r=0.989[/latex]. When [latex]x[/latex] increases, [latex]y[/latex] also increases, so there is a positive linear association between [latex]y[/latex] and [latex]x[/latex], i.e., [latex]r > 0[/latex]. The association is extremely strong, since the points lie almost exactly on a straight line.
Exercise: Concepts on Correlation Coefficient
Explain whether the following statements are true or false. Correct them if they are false.
- If [latex]r \approx 0[/latex], there is no association between [latex]y[/latex] and [latex]x[/latex].
- The larger the value of [latex]r[/latex], the stronger the association between [latex]y[/latex] and [latex]x[/latex].
Answers:
- False. If [latex]r \approx 0[/latex], there is no linear association between [latex]y[/latex] and [latex]x[/latex]; a non-linear association may still exist.
- False. The larger the absolute value of [latex]r[/latex], the stronger the linear association. Equivalently, the closer [latex]r[/latex] is to +1 or -1, the stronger the linear association.