3.9 Contingency Table: Joint and Marginal Probability
Recall the contingency table in the example of association between breast cancer and smoking:
Table 3.3: Contingency Table of “Cancer Status” and “Smoking Status”
| | Smoker (S) | Non-smoker (not S) | Total |
---|---|---|---|
Breast Cancer (B) | 10 (B & S) | 30 (B & not S) | 40 (B) |
Cancer Free (not B) | 20 (not B & S) | 140 (not B & not S) | 160 (not B) |
Total | 30 (S) | 170 (not S) | 200 |
The row variable is “Cancer Status”, with two possible values: breast cancer and cancer free. The column variable is “Smoking Status”, with two possible values: smoker and non-smoker.
The marginal probabilities are the row or column totals divided by the grand total. For the current example, the marginal probabilities are:
[latex]P(B) = \frac{40}{200} = 0.2, \quad P(\text{not }B) = \frac{160}{200} = 0.8;[/latex]
[latex]P(S) = \frac{30}{200} = 0.15, \quad P(\text{not }S) = \frac{170}{200} = 0.85.[/latex]
Note that [latex]P(B)=0.2[/latex] and [latex]P(\text{not }B)=0.8[/latex] together give the marginal probability distribution of the row variable “Cancer Status”, and they sum to 1. Similarly, [latex]P(S)=0.15[/latex] and [latex]P(\text{not }S)=0.85[/latex] give the marginal probability distribution of the column variable “Smoking Status”, and they also sum to 1.
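The marginal calculation can be reproduced with a short Python sketch. The dictionary layout and the variable names (`counts`, `row_totals`, `col_totals`, `p_cancer`, `p_smoking`) are illustrative choices, not part of the original example; only the four cell counts come from Table 3.3.

```python
# Cell counts from Table 3.3, keyed by (cancer status, smoking status).
# The key labels are illustrative names chosen for this sketch.
counts = {
    ("B", "S"): 10,
    ("B", "not S"): 30,
    ("not B", "S"): 20,
    ("not B", "not S"): 140,
}

grand_total = sum(counts.values())  # 200

# Accumulate row totals (cancer status) and column totals (smoking status).
row_totals = {}
col_totals = {}
for (cancer, smoking), n in counts.items():
    row_totals[cancer] = row_totals.get(cancer, 0) + n
    col_totals[smoking] = col_totals.get(smoking, 0) + n

# Marginal probability = row or column total divided by the grand total.
p_cancer = {k: v / grand_total for k, v in row_totals.items()}
p_smoking = {k: v / grand_total for k, v in col_totals.items()}

print(p_cancer)   # {'B': 0.2, 'not B': 0.8}
print(p_smoking)  # {'S': 0.15, 'not S': 0.85}
```

Each dictionary holds one marginal distribution, so its values sum to 1.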
The joint probabilities are the frequencies in the cells divided by the grand total. For the current example, the joint probabilities are:
[latex]P(B \: \& \: S) = \frac{10}{200}=0.05, \quad P(B\: \& \text{ not }S) = \frac{30}{200} = 0.15,[/latex]
[latex]P(\text{not }B \: \& \: S) = \frac{20}{200}=0.1, \quad P(\text{not }B \: \& \text{ not }S) = \frac{140}{200} = 0.7.[/latex]
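As a check on the arithmetic, the joint probabilities can be computed in a minimal Python sketch. The names `counts`, `joint`, `p_b`, and `p_s` are hypothetical, and `fractions.Fraction` is used only so that the sums are exact. The sketch also illustrates a fact that follows directly from the table: adding the joint probabilities along a row or down a column gives the corresponding marginal probability.

```python
from fractions import Fraction

# Cell counts from Table 3.3, keyed by (cancer status, smoking status).
counts = {
    ("B", "S"): 10,
    ("B", "not S"): 30,
    ("not B", "S"): 20,
    ("not B", "not S"): 140,
}

grand_total = sum(counts.values())  # 200

# Joint probability = cell frequency divided by the grand total.
joint = {cell: Fraction(n, grand_total) for cell, n in counts.items()}

for cell, p in joint.items():
    print(cell, float(p))

# The four cells cover every possible outcome, so the joint probabilities sum to 1.
assert sum(joint.values()) == 1

# Summing across the "B" row gives P(B); summing down the "S" column gives P(S).
p_b = joint[("B", "S")] + joint[("B", "not S")]
p_s = joint[("B", "S")] + joint[("not B", "S")]
print(float(p_b), float(p_s))  # 0.2 0.15
```

Exact fractions avoid floating-point rounding in the equality check; plain division would also work if the comparison used a small tolerance.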