10.2 Distribution of the Sample Proportion
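Inferences about the population mean $\mu$ are based on the sample mean $\bar{x}$. In the same way, inferences about the population proportion $p$ are based on the sample proportion $\hat{p}$.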
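The population proportion is defined as
$$p = \frac{\text{number of individuals in the population with the specific attribute}}{\text{population size } N}.$$
The population proportion can be regarded as a special type of population mean if we let the variable of interest be an indicator variable as follows:
$$x_i = \begin{cases} 1 & \text{if the } i\text{th individual has the specific attribute,} \\ 0 & \text{otherwise.} \end{cases}$$
Then, the population proportion can be rewritten as
$$p = \frac{\sum_{i=1}^{N} x_i}{N} = \mu.$$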
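The variable of interest $x$ takes only the values 1 and 0, with the probability distribution shown in Table 10.1,
Table 10.1: Probability Distribution of an Indicator Variable
| $x$ | 1 | 0 |
|---|---|---|
| $P(X = x)$ | $p$ | $1 - p$ |

with a population mean and population standard deviation:
$$\mu = p, \qquad \sigma = \sqrt{p(1-p)}.$$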
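The mean and standard deviation of the indicator variable follow from the usual definitions for a discrete random variable, applied to the two values in Table 10.1. A short derivation, written out as a check:

$$
\begin{aligned}
\mu &= \sum x \, P(X = x) = 1 \cdot p + 0 \cdot (1 - p) = p, \\
\sigma^2 &= \sum (x - \mu)^2 \, P(X = x) = (1 - p)^2 \, p + (0 - p)^2 (1 - p) = p(1 - p)\,[(1 - p) + p] = p(1 - p), \\
\sigma &= \sqrt{p(1 - p)}.
\end{aligned}
$$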
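The sample proportion can be viewed as a special type of sample mean, in the same way that the population proportion can be viewed as a special type of population mean. That is, in a simple random sample of size $n$, the proportion of individuals with the specific attribute is the sample proportion:
$$\hat{p} = \frac{\text{number of individuals in the sample with the specific attribute}}{n} = \frac{\sum_{i=1}^{n} x_i}{n} = \bar{x},$$
with $x_i = 1$ if the $i$th individual in the sample has the specific attribute and $x_i = 0$ otherwise.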
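To make the link between the sample proportion and the sample mean concrete, the following minimal Python sketch codes a small sample as 0/1 indicator values and computes the proportion both ways. The ten data values are hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical sample of n = 10 individuals:
# 1 = has the specific attribute, 0 = does not.
sample = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]

n = len(sample)
count_with_attribute = sum(sample)   # number of 1s in the sample
p_hat = count_with_attribute / n     # sample proportion, by definition
x_bar = mean(sample)                 # sample mean of the indicator values

print(p_hat, x_bar)
```

Both quantities are the same number, because summing 0/1 indicators simply counts the individuals with the attribute.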
Recall from Chapter 6 the sampling distribution of the sample mean $\bar{x}$:
- Centre: the mean of the sample mean $\bar{x}$ equals the population mean $\mu$. That is, $\mu_{\bar{x}} = \mu$.
- Spread: the standard deviation of the sample mean $\bar{x}$ equals the population standard deviation $\sigma$ divided by the square root of the sample size. That is, $\sigma_{\bar{x}} = \sigma / \sqrt{n}$.
These two results are true for any population distribution and sample size $n$.
- Shape:
  - When the population distribution is normal, $\bar{x}$ is also normal regardless of $n$.
  - When the population distribution is non-normal but the sample size $n$ is large, $\bar{x}$ is approximately normally distributed. This is guaranteed by the central limit theorem (CLT).
The same conclusions can be applied to the sampling distribution of the sample proportion $\hat{p}$, with the population mean $\mu = p$ and the population standard deviation $\sigma = \sqrt{p(1-p)}$.
Key Facts: Sampling Distribution of the Sample Proportion
- Centre: the mean of the sample proportion $\hat{p}$ equals the population mean $\mu = p$. That is, $\mu_{\hat{p}} = p$.
- Spread: the standard deviation of the sample proportion $\hat{p}$ equals the population standard deviation $\sigma = \sqrt{p(1-p)}$ divided by the square root of the sample size. That is, $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$.
These two results are true for any population proportion $p$ and sample size $n$.
- Shape: The population distribution is non-normal. By the central limit theorem (CLT), however, $\hat{p}$ is approximately normal if $n$ is large enough. The rule of thumb is to guarantee both $np \ge 5$ and $n(1-p) \ge 5$, i.e., $\min\{np, n(1-p)\} \ge 5$. Some textbooks require both $np \ge 10$ and $n(1-p) \ge 10$.
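The key facts can be collected into a small helper. The sketch below is illustrative only: the function name `sample_proportion_summary` and the default cutoff `threshold=5` are assumptions made for this example, and a stricter cutoff such as 10 can be passed instead.

```python
import math

def sample_proportion_summary(p, n, threshold=5):
    """Return (mean, standard deviation, large-sample flag) for p-hat.

    `threshold` is an assumed rule-of-thumb cutoff applied to both
    n*p and n*(1-p); it is not a fixed standard.
    """
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be a positive integer")
    mean = p                                  # centre of p-hat
    sd = math.sqrt(p * (1 - p) / n)           # spread of p-hat
    large_enough = n * p >= threshold and n * (1 - p) >= threshold
    return mean, sd, large_enough

# Example with p = 0.05 and several sample sizes.
for n in (50, 100, 200, 1000):
    mean, sd, ok = sample_proportion_summary(0.05, n)
    print(f"n={n:5d}  mean={mean:.3f}  sd={sd:.4f}  large enough: {ok}")
```

The mean does not depend on $n$, while the standard deviation shrinks in proportion to $1/\sqrt{n}$, so quadrupling the sample size halves the spread.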
Central limit theorem for the sample proportion:
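If the sample size $n$ is large enough ($np \ge 5$ and $n(1-p) \ge 5$), the sample proportion $\hat{p}$ is approximately normally distributed with mean $\mu_{\hat{p}} = p$ and standard deviation $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$.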
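As a rough illustration of how the normal approximation is used, the sketch below computes the probability that the sample proportion is at most a given cutoff and compares it with the exact value based on the binomial count. The values `p = 0.4`, `n = 100`, and `cutoff = 0.45` are hypothetical and chosen only for this example.

```python
import math
from statistics import NormalDist

# Hypothetical values chosen for illustration only.
p, n = 0.4, 100
cutoff = 0.45

# Normal approximation: p-hat is roughly N(p, sqrt(p(1-p)/n)) for large n.
sd = math.sqrt(p * (1 - p) / n)
approx = NormalDist(mu=p, sigma=sd).cdf(cutoff)

# Exact value from the binomial count X = n * p-hat, for comparison.
k_max = math.floor(cutoff * n + 1e-9)   # largest count with p-hat <= cutoff
exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_max + 1))

print(f"normal approximation: {approx:.4f}")
print(f"exact binomial:       {exact:.4f}")
```

The two values are close but not equal, which is what "approximately normal" means in practice; the gap narrows as $n$ grows.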
For example, suppose the population proportion is $p = 0.05$.
The following figures show the sampling distribution of the sample proportion with $n = 50$, $100$, $200$, and $1000$.
Figure 10.1: Histograms of Sample Proportions with Different Sample Sizes. [Image Description (See Appendix D Figure 10.1)]
There are several findings:
- The sampling distribution of the sample proportion becomes increasingly normal as the sample size n increases. When n = 50, the sampling distribution of the sample proportion is right skewed. When n = 100, the distribution is still slightly right skewed. For n = 200 and n = 1000, the sampling distribution appears bell-shaped and symmetric (indicative of a normal distribution).
- The mean of the sample proportion (blue dashed line) is always identical to the population proportion p = 0.05 (red solid line) regardless of the sample size n.
- The standard deviation of the sample proportion decreases as n increases.
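These findings can be checked with a small simulation. The sketch below is not the code behind Figure 10.1; it is a minimal re-creation under stated assumptions: each individual is drawn independently with probability $p = 0.05$ of having the attribute (as for a very large population), the number of repetitions `reps = 2000` and the seed are arbitrary, and skewness is measured with a simple moment-based formula.

```python
import math
import random
import statistics

def simulate_sample_proportions(p, n, reps, rng):
    """Draw `reps` samples of size n and return their sample proportions."""
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

def skewness(values):
    """Moment-based skewness (0 for a perfectly symmetric distribution)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum((v - m) ** 3 for v in values) / (len(values) * s ** 3)

rng = random.Random(2024)   # fixed seed so the run is repeatable
p, reps = 0.05, 2000        # reps is an arbitrary choice for this sketch

results = {}
for n in (50, 100, 200, 1000):
    p_hats = simulate_sample_proportions(p, n, reps, rng)
    results[n] = (statistics.fmean(p_hats),
                  statistics.pstdev(p_hats),
                  skewness(p_hats))
    mean_hat, sd_hat, skew = results[n]
    theory_sd = math.sqrt(p * (1 - p) / n)
    print(f"n={n:5d}  mean={mean_hat:.4f}  sd={sd_hat:.4f} "
          f"(theory {theory_sd:.4f})  skewness={skew:.2f}")
```

In a run of this sketch, the simulated means should sit close to 0.05 for every $n$, the simulated standard deviations should track $\sqrt{p(1-p)/n}$, and the skewness should move toward 0 as $n$ increases.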
To summarize, for