3.9 Contingency Table: Joint and Marginal Probability

Recall the contingency table from the example on the association between breast cancer and smoking:

Table 3.3: Contingency Table of “Cancer Status” and “Smoking Status”

| | Smoker (S) | Non-smoker (not S) | Total |
|---|---|---|---|
| Breast Cancer (B) | 10 (B & S) | 30 (B & not S) | 40 (B) |
| Cancer Free (not B) | 20 (not B & S) | 140 (not B & not S) | 160 (not B) |
| Total | 30 (S) | 170 (not S) | 200 |

The row variable is “Cancer Status” with two possible values: breast cancer or cancer free. The column variable is “Smoking Status” with two possible values: smoker or non-smoker.

The marginal probabilities are the row or column totals divided by the grand total. For the current example, the marginal probabilities are:

[latex]P(B) = \frac{40}{200} = 0.2, \quad P(\text{not }B) = \frac{160}{200} = 0.8;[/latex]

[latex]P(S) = \frac{30}{200} = 0.15, \quad P(\text{not }S) = \frac{170}{200} = 0.85.[/latex]

Note that [latex]P(B)=0.2[/latex] and [latex]P(\mbox{not } B)=0.8[/latex] give the marginal probability distribution of the row variable “Cancer Status” and they add up to 1. Similarly, [latex]P(S)=0.15[/latex] and [latex]P(\mbox{not } S)=0.85[/latex] give the marginal probability distribution of the column variable “Smoking Status” and they sum to 1.
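The marginal calculation can also be checked numerically. The following is a minimal sketch in Python, assuming the cell counts of Table 3.3 are stored in a nested dictionary; the variable names (`counts`, `row_marginals`, `col_marginals`) are illustrative choices and do not come from the text.

```python
from fractions import Fraction

# Cell counts from Table 3.3: outer keys are cancer status (rows),
# inner keys are smoking status (columns).
counts = {
    "B": {"S": 10, "not S": 30},
    "not B": {"S": 20, "not S": 140},
}

# Grand total: the sum of all four cell counts.
grand_total = sum(sum(row.values()) for row in counts.values())

# Row totals divided by the grand total: marginal distribution of "Cancer Status".
row_marginals = {
    row: Fraction(sum(cols.values()), grand_total)
    for row, cols in counts.items()
}

# Column totals divided by the grand total: marginal distribution of "Smoking Status".
col_labels = ["S", "not S"]
col_marginals = {
    col: Fraction(sum(counts[row][col] for row in counts), grand_total)
    for col in col_labels
}

for label, p in {**row_marginals, **col_marginals}.items():
    print(f"P({label}) = {float(p)}")

# Each marginal distribution should sum to exactly 1.
print(sum(row_marginals.values()) == 1, sum(col_marginals.values()) == 1)
```

Exact fractions are used here so that the check that each distribution sums to 1 is not affected by floating-point rounding.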

The joint probabilities are the frequencies in the cells divided by the grand total. For the current example, the joint probabilities are:

[latex]P(B \: \& \: S) = \frac{10}{200}=0.05, \quad P(B\: \& \text{ not }S) = \frac{30}{200} = 0.15,[/latex]

[latex]P(\text{not }B \: \& \: S) = \frac{20}{200}=0.1, \quad P(\text{not }B \: \& \text{ not }S) = \frac{140}{200} = 0.7.[/latex]
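As a quick check on the joint probabilities, the sketch below recomputes them from the cell counts of Table 3.3 and confirms two facts that follow from the definitions: the four joint probabilities sum to 1, and adding the joint probabilities across a row or down a column recovers the corresponding marginal probability. The dictionary layout and names (`counts`, `joint`, `p_B`, `p_S`) are assumptions made for illustration.

```python
from fractions import Fraction

# Cell counts from Table 3.3, keyed by (cancer status, smoking status).
counts = {
    ("B", "S"): 10,
    ("B", "not S"): 30,
    ("not B", "S"): 20,
    ("not B", "not S"): 140,
}

grand_total = sum(counts.values())

# Joint probability of each cell: cell frequency divided by the grand total.
joint = {cell: Fraction(n, grand_total) for cell, n in counts.items()}

for (row, col), p in joint.items():
    print(f"P({row} & {col}) = {float(p)}")

# The four joint probabilities form a distribution, so they sum to 1.
total_probability = sum(joint.values())

# Summing across the first row gives P(B); summing down the first column gives P(S).
p_B = joint[("B", "S")] + joint[("B", "not S")]
p_S = joint[("B", "S")] + joint[("not B", "S")]

print(f"sum of joint probabilities = {float(total_probability)}")
print(f"P(B) = {float(p_B)}, P(S) = {float(p_S)}")
```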

License

Introduction to Applied Statistics Copyright © 2024 by Wanhua Su is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.