11.3 Chi-Square Goodness-of-Fit Test
The chi-square goodness-of-fit test can be applied to either a categorical variable or a discrete quantitative variable with a finite number of values. Its objective is to test whether the variable follows the probability distribution specified in the null hypothesis [latex]H_0[/latex].
The main idea behind the chi-square goodness-of-fit test is to compare the observed frequencies ([latex]O[/latex]) to the expected frequencies ([latex]E[/latex]), which are based on the probability distribution specified in [latex]H_0[/latex]. If [latex]H_0[/latex] is true, the observed and expected frequencies should be reasonably similar. Therefore, we reject [latex]H_0[/latex] if the observed and expected frequencies are very different. The discrepancy between the observed and expected frequencies can be quantified by the chi-square statistic
[latex]\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E}[/latex]
which, when [latex]H_0[/latex] is true and the assumptions listed below are met, approximately follows a chi-square distribution with [latex]df = k-1[/latex], where [latex]k[/latex] is the number of possible values (cells) of the variable under consideration. The chi-square statistic is large when the observed and expected frequencies are very different, so we reject the null hypothesis when the statistic is sufficiently large. More specifically, at the significance level [latex]\alpha[/latex], we reject [latex]H_0[/latex] if the chi-square statistic is at least as large as the critical value [latex]\chi_{\alpha}^2[/latex]. Since we only reject [latex]H_0[/latex] for large values of the statistic, chi-square tests are always right-tailed. That is, the rejection region lies in the upper tail and the P-value is an upper-tail probability.
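As a minimal sketch of how the statistic is computed, the Python function below sums [latex]\frac{(O - E)^2}{E}[/latex] over all cells, taking each expected frequency as [latex]E = np[/latex]. The function name `chi_square_statistic` and the die-rolling counts are illustrative assumptions, not data from this section.

```python
def chi_square_statistic(observed, probabilities):
    """Return (statistic, df) for observed counts and the H0 probabilities."""
    n = sum(observed)                          # total number of observations
    expected = [n * p for p in probabilities]  # E = n * p for each cell
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1        # df = k - 1


# Hypothetical data: 60 rolls of a die; H0 says each face has probability 1/6.
rolls = [8, 12, 9, 11, 10, 10]
stat, df = chi_square_statistic(rolls, [1 / 6] * 6)
print(f"chi-square = {stat:.3f}, df = {df}")
```

Because every term is a squared difference divided by a positive expected count, the statistic can never be negative, which is why only large values count as evidence against [latex]H_0[/latex].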
Chi-Square Goodness-of-Fit Test
Assumptions:
1. All expected frequencies are at least 1.
2. At most 20% of the expected frequencies are less than 5.
3. Simple random sample (if you need to generalize the conclusion to a larger population).
Note: If assumption 1 or 2 is violated, one can consider combining cells to increase the expected counts in those cells.
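The two expected-frequency assumptions are easy to check mechanically. The sketch below is one possible way to do so; the helper name `check_expected_counts` and the four expected counts are hypothetical.

```python
def check_expected_counts(expected):
    """Check the two expected-frequency assumptions of the chi-square test.

    Returns a pair of booleans:
      (all expected counts are at least 1,
       at most 20% of the expected counts are below 5).
    """
    all_at_least_1 = all(e >= 1 for e in expected)
    share_below_5 = sum(e < 5 for e in expected) / len(expected)
    return all_at_least_1, share_below_5 <= 0.20


# Hypothetical expected counts for a four-cell problem: one cell of four is below 5.
ok_min, ok_share = check_expected_counts([12.0, 8.5, 6.0, 3.5])
print(ok_min, ok_share)
```

With only four cells, a single expected count below 5 already amounts to 25% of the cells, so the second check fails even though the first passes.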
Steps to perform a chi-square goodness-of-fit test:
First, check the assumptions. Calculate the expected frequency for each possible value of the variable using [latex]E=np[/latex], where [latex]n[/latex] is the total number of observations and [latex]p[/latex] is the relative frequency (or probability) specified in the null hypothesis. Check whether the expected frequencies satisfy assumptions 1 and 2. If not, consider combining some cells. Then carry out the following steps:
1. Set up the hypotheses:
[latex]\begin{align*} H_0 &: \text{The variable has the specified distribution}. \\ H_a &: \text{The variable does not have the specified distribution}. \end{align*}[/latex]
2. State the significance level [latex]\alpha[/latex].
3. Compute the value of the test statistic: [latex]\chi_o^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E}[/latex] with [latex]df = k-1[/latex].
4. Find the P-value or rejection region based on the [latex]\chi^2[/latex] curve with [latex]df = k-1[/latex]:
   - Rejection region: [latex]\chi^2 \geq \chi_{\alpha}^2[/latex], the region to the right of [latex]\chi_{\alpha}^2[/latex], whose area is [latex]\alpha[/latex].
   - P-value: [latex]P(\chi^2 \geq \chi_o^2)[/latex], the area to the right of [latex]\chi_o^2[/latex] under the curve.
5. Decision: Reject the null [latex]H_0[/latex] if P-value [latex]\leq \alpha[/latex] or [latex]\chi_o^2[/latex] falls in the rejection region.
6. State the conclusion in the context of the problem.
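The P-value in the steps above is usually read from a chi-square table or obtained from software. As a rough sketch that relies only on Python's standard library, the upper-tail area can be computed from the known closed-form expressions for integer degrees of freedom. The function name `chi_square_upper_tail` and the statistic in the usage lines are assumptions made for illustration.

```python
import math


def chi_square_upper_tail(x, df):
    """P(chi-square with integer df >= x) for x >= 0, via closed-form series."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        total = sum(half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2))
    #         + exp(-x/2) * sum_{i=1}^{(df-1)/2} (x/2)^(i - 1/2) / Gamma(i + 1/2)
    total = sum(half ** (i - 0.5) / math.gamma(i + 0.5)
                for i in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


alpha = 0.05
statistic, df = 9.0, 3  # hypothetical values, not taken from the text
p_value = chi_square_upper_tail(statistic, df)
print(f"p-value = {p_value:.4f}; reject H0: {p_value <= alpha}")
```

Comparing the P-value with [latex]\alpha[/latex] and comparing [latex]\chi_o^2[/latex] with the critical value [latex]\chi_{\alpha}^2[/latex] always lead to the same decision, since the critical value is exactly the point whose upper-tail area is [latex]\alpha[/latex].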
Example: Chi-Square Goodness-of-Fit Test
According to the results of the federal election in 2015, 31.9% of votes supported the Conservative Party, 39.5% supported the Liberal Party, 19.7% supported the New Democratic Party (NDP), 4.7% supported Bloc Québécois, and 3.4% supported the Green Party (data from Wikipedia). Thirty-seven students in my Stat151 class responded to an online survey and their preferences are summarized in the following table:
Table 11.2: Voting Preference of the Class
Conservative | Green | Liberal | NDP | Not Voting | Others |
---|---|---|---|---|---|
9 | 2 | 17 | 6 | 3 | 0 |
Test at the 5% significance level whether the class had different voting preferences than all Canadians in the 2015 election.
Check the assumptions: since [latex]n = 37[/latex], each expected frequency is computed as [latex]E = np = 37 \times p[/latex]. For example, the expected count of Conservative voters is [latex]E = 37 \times 0.319 = 11.803[/latex]. The following table gives all expected counts:
Table 11.3: Expected Frequency of Voting Preference
Party | Conservative | Green | Liberal | NDP | Bloc Québécois | Others |
---|---|---|---|---|---|---|
Proportion [latex](p)[/latex] | 0.319 | 0.034 | 0.395 | 0.197 | 0.047 | 0.008 |
Expected count [latex](E = 37p)[/latex] | 11.803 | 1.258 | 14.615 | 7.289 | 1.739 | 0.296 |
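The expected counts in Table 11.3 can be reproduced with a few lines of Python. This is only a sketch with illustrative variable names; it also lists which cells fall below the thresholds of 5 and 1 used in the assumptions.

```python
n = 37  # number of students who responded to the survey

# Proportions of the 2015 vote, as given in the example.
proportions = {
    "Conservative": 0.319,
    "Green": 0.034,
    "Liberal": 0.395,
    "NDP": 0.197,
    "Bloc Québécois": 0.047,
    "Others": 0.008,
}

# Expected count for each party under H0: E = n * p.
expected = {party: n * p for party, p in proportions.items()}
for party, e in expected.items():
    print(f"{party:15s} {e:7.3f}")

# Cells that are relevant to assumptions 1 and 2.
below_5 = [party for party, e in expected.items() if e < 5]
below_1 = [party for party, e in expected.items() if e < 1]
print("Expected count below 5:", below_5)
print("Expected count below 1:", below_1)
```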
There are [latex]k = 6[/latex] cells, so at most [latex]6 \times 0.2 = 1.2[/latex] cells (that is, at most one cell) may have an expected count less than 5; however, three cells actually have expected counts less than 5, and the expected count for “Others” is also less than 1. We could combine the cells “Green”, “Bloc Québécois” and “Others” into a single cell named “Others”. Therefore, we have the working table as follows.
Table 11.4: Working Table for a Chi-Square Goodness of Fit Test (Example)
Parties | Proportion [latex]p[/latex] | Observed [latex]O[/latex] | Expected [latex]E = np = 37 \times p[/latex] | [latex]\frac{(O - E)^2}{E}[/latex] |
---|---|---|---|---|
Conservative | [latex]0.319[/latex] | [latex]9[/latex] | [latex]37 \times 0.319=11.803[/latex] | [latex]\frac{(9 - 11.803)^2}{11.803} = 0.6657[/latex] |
Liberal | [latex]0.395[/latex] | [latex]17[/latex] | [latex]37 \times 0.395=14.615[/latex] | [latex]\frac{(17 - 14.615)^2}{14.615} = 0.3892[/latex] |
NDP | [latex]0.197[/latex] | [latex]6[/latex] | [latex]37 \times 0.197=7.289[/latex] | [latex]\frac{(6 - 7.289)^2}{7.289} = 0.2279[/latex] |
Others | [latex]0.089[/latex] | [latex]2+3+0=5[/latex] | [latex]37 \times 0.089=3.293[/latex] | [latex]\frac{(5 - 3.293)^2}{3.293} = 0.8849[/latex] |
Sum | [latex]1[/latex] | [latex]37[/latex] | [latex]37[/latex] | [latex]\chi_o^2 = 2.1677[/latex] |
Note: After combining the cells, all the expected counts are greater than 1, but one of the four expected counts (25%) is below 5, namely the expected count for “Others”. Since more than 20% of the expected counts are below 5, assumption 2 is still violated. However, the expected frequency for “Others” is 3.293, which is not very far from 5. To maintain a meaningful number of parties, we proceed to conduct the chi-square goodness-of-fit test.
Steps to perform a chi-square goodness-of-fit test:
1. Set up the hypotheses: [latex]\begin{align*} H_0 & : p_{\scriptsize C} = 0.319, p_{\scriptsize L} = 0.395, p_{\scriptsize NDP} = 0.197, p_{\scriptsize Others} = 0.089 \\ H_a & : \text{At least one proportion is different from those specified in } H_0. \end{align*}[/latex]
2. The significance level is [latex]\alpha = 0.05[/latex].
3. The test statistic: [latex]\chi_o^2 = \sum_{\text{all cells}} \frac{(O- E)^2}{E} = 2.1677[/latex], with [latex]df = k -1 = 4 - 1 =3[/latex].
4. Find the P-value. Since chi-square tests are always right-tailed, the P-value is [latex]P(\chi^2 \geq \chi_o^2) = P(\chi^2 \geq 2.1677) \: \gt \: 0.1[/latex].
5. Decision: We do not reject the null [latex]H_0[/latex], since P-value [latex]\: \gt \: 0.1 \: \gt \: 0.05 = \alpha[/latex].
6. Conclusion: At the 5% significance level, we do not have sufficient evidence that the class had different voting preferences than all Canadians in the 2015 election.
If using the critical value approach, steps 4–6 are as follows:
4. Find the rejection region. For a right-tailed test with [latex]df=3[/latex], the rejection region is [latex]\chi^2 \geq \chi_{\alpha}^2 = \chi_{0.05}^2 = 7.815[/latex], the region at or to the right of the critical value.
5. Decision: We do not reject the null [latex]H_0[/latex], since [latex]\chi_o^2 = 2.1677 < 7.815[/latex] falls in the non-rejection region.
6. Conclusion: At the 5% significance level, we do not have sufficient evidence that the class had different voting preferences than all Canadians in the 2015 election.
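The hand calculation for this example can be double-checked with a short script. The sketch below recomputes the statistic from the merged working table and evaluates the upper-tail probability using the closed form that holds for [latex]df = 3[/latex]; the variable names are illustrative.

```python
import math

# Working table after merging cells: party -> (proportion under H0, observed count).
cells = {
    "Conservative": (0.319, 9),
    "Liberal": (0.395, 17),
    "NDP": (0.197, 6),
    "Others": (0.089, 5),
}

n = sum(o for _, o in cells.values())  # total number of respondents
chi_sq = sum((o - n * p) ** 2 / (n * p) for p, o in cells.values())

# Upper-tail probability for df = 3 only:
# P(X >= x) = erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2)
p_value = (math.erfc(math.sqrt(chi_sq / 2))
           + math.sqrt(2 * chi_sq / math.pi) * math.exp(-chi_sq / 2))

critical_value = 7.815  # chi-square critical value for df = 3, alpha = 0.05
print(f"chi-square = {chi_sq:.4f}, p-value = {p_value:.3f}")
print("Reject H0:", chi_sq >= critical_value)
```

The computed P-value is roughly 0.54, consistent with the table-based bound of P-value > 0.1, and both approaches lead to the same decision not to reject [latex]H_0[/latex].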
Exercise: Chi-Square Goodness-of-Fit Test
A company claims their deluxe mixed nuts consist of 20% peanuts, 60% cashews, and 20% almonds. An inspector obtains a random sample of [latex]n = 100[/latex] nuts and observes 30 peanuts, 55 cashews, and 15 almonds. Test at the 5% significance level whether the percentages differ from what the company claims.
Answers:
Check the assumptions: [latex]n = 100[/latex] and the expected counts are
[latex]E_{\text{peanut}} = 100 \times 0.2 = 20, E_{\text{cashew}} = 100 \times 0.6 = 60,[/latex] [latex]E_{\text{almond}} = 100 \times 0.2 = 20[/latex], all of which are greater than 5, so assumptions 1 and 2 are satisfied.
Steps to perform a chi-square goodness-of-fit test:
1. Set up the hypotheses:
[latex]\begin{align*} H_0 &: p_{\text{peanut}} = 0.2, p_{\text{cashew}} = 0.6, p_{\text{almond}} = 0.2 \\ H_a &: \text{at least one proportion is different from those specified in } H_0. \end{align*}[/latex]
2. The significance level is [latex]\alpha = 0.05[/latex].
3. The test statistic with the working table:

Table 11.5: Working Table for Chi-Square Goodness-of-Fit Test (Exercise)
Nuts | Proportion [latex]p[/latex] | Observed [latex]O[/latex] | Expected [latex]E = np = 100 \times p[/latex] | [latex]\frac{(O-E)^2}{E}[/latex] |
---|---|---|---|---|
Peanut | [latex]0.2[/latex] | [latex]30[/latex] | [latex]100 \times 0.2 = 20[/latex] | [latex]\frac{(30 - 20)^2}{20} = 5.000[/latex] |
Cashew | [latex]0.6[/latex] | [latex]55[/latex] | [latex]100 \times 0.6 = 60[/latex] | [latex]\frac{(55 - 60)^2}{60} = 0.417[/latex] |
Almond | [latex]0.2[/latex] | [latex]15[/latex] | [latex]100 \times 0.2 = 20[/latex] | [latex]\frac{(15 - 20)^2}{20} = 1.250[/latex] |
Sum | [latex]1[/latex] | [latex]100[/latex] | [latex]100[/latex] | [latex]\chi_o^2 = 6.667[/latex] |

[latex]\chi_o^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E} = 6.667[/latex] with [latex]df = k - 1 = 3-1 =2[/latex].
4. Find the P-value: P-value [latex]= P(\chi^2 \geq \chi_o^2) = P(\chi^2 \geq 6.667)[/latex]. Since [latex]5.991 (\chi_{0.05}^2) < \chi_o^2=6.667 < 7.378 (\chi_{0.025}^2)[/latex], we have 0.025 < P-value < 0.05.
5. Decision: We reject the null [latex]H_0[/latex], since P-value [latex]< 0.05 = \alpha[/latex].
6. Conclusion: At the 5% significance level, we have sufficient evidence that the percentages of nuts are different from what the company claims.
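As a quick numerical check of this answer, the sketch below recomputes the statistic and uses the fact that, for [latex]df = 2[/latex], the upper-tail probability of the chi-square distribution reduces to [latex]e^{-x/2}[/latex]. Variable names are illustrative.

```python
import math

claimed = [0.2, 0.6, 0.2]   # peanuts, cashews, almonds under H0
observed = [30, 55, 15]
n = sum(observed)           # 100 nuts in the sample

# Chi-square statistic: sum of (O - E)^2 / E with E = n * p.
chi_sq = sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, claimed))

# For df = 2 only, P(X >= x) = exp(-x / 2).
p_value = math.exp(-chi_sq / 2)

print(f"chi-square = {chi_sq:.3f}, p-value = {p_value:.4f}")
print("Reject H0 at alpha = 0.05:", p_value <= 0.05)
```

The exact P-value of about 0.036 falls inside the interval 0.025 < P-value < 0.05 obtained from the table of critical values.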