4.4 Mean and Standard Deviation of a Discrete Variable
Given the probability distribution of a discrete random variable X, we are able to calculate the mean, variance, and standard deviation.
The mean of a discrete variable X can be calculated as
[latex]\mu = \sum xP(X=x),[/latex]
which is a weighted average over all possible values of X. Each possible value x is weighted by its probability [latex]P(X=x)[/latex]. The mean is also called the expected value or the expectation of X. Note that a probability distribution can be viewed as the relative frequency distribution of some population. In this regard, the mean of a discrete random variable is equivalent to the population mean.
Some students might find it easier to find the mean by constructing a working table (see the following example).
Example: Mean of a Discrete Variable
Mark has no siblings, John has one sibling, both Rebecca and Sarah have two siblings, and Mary has three. Randomly pick one student and let X be the number of siblings the student has. Find the mean (expected value) of X.
We calculate the mean of X by constructing a working table. The first two columns of the table give the probability distribution of X and, in each row, the value in the third column is the product of the first two values.
Table 4.3: Working Table for the Mean (Expected Value) of a Discrete Variable
[latex]x[/latex]  [latex]P(X=x)[/latex]  [latex]xP(X=x)[/latex]
0  0.2  0 × 0.2 = 0
1  0.2  1 × 0.2 = 0.2
2  0.4  2 × 0.4 = 0.8
3  0.2  3 × 0.2 = 0.6
Sum  [latex]\sum P(X=x) = 1.0[/latex]  [latex]\sum x P(X=x) = 1.6[/latex]
Taking the sum of the values in the last column gives the mean (expected value) of X, i.e.,
[latex]\begin{align*} \mu &= \sum x P(X=x) \\ &= 0 + 0.2 + 0.8 + 0.6 \\ &= 1.6. \end{align*}[/latex]
Interpretation: On average, each of those five students has 1.6 siblings.
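The working-table calculation can also be checked with a few lines of code. The following is a minimal Python sketch; the names `distribution` and `mean` are chosen here for illustration and are not part of the text. Each dictionary key is a possible value x and each dictionary value is the probability P(X = x).

```python
# Probability distribution of X = number of siblings (values and probabilities from Table 4.3).
distribution = {0: 0.2, 1: 0.2, 2: 0.4, 3: 0.2}


def mean(dist):
    """Return the expected value: the sum of x * P(X = x) over all possible values x."""
    return sum(x * p for x, p in dist.items())


mu = mean(distribution)
print(round(mu, 4))  # 1.6
```

Each term `x * p` corresponds to one entry in the last column of the working table, and the sum corresponds to the "Sum" row.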
Even though the random variable X (the number of siblings) can take only integer values, we should keep the decimal place in the mean of X. That is, do not round the mean 1.6 to 2. To see why, suppose that this probability distribution describes a much larger population of students. Although it is counterintuitive to say that we expect a student to have 1.6 siblings, it is quite natural to say that we expect 10 students to have a total of 16 siblings, 100 students to have a total of 160 siblings, and so on. If we survey the entire population of students, then the combined number of siblings is 1.6 times the number of students. Hence, the average number of siblings per student is 1.6.
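A small simulation can make this long-run reading of the mean concrete. The sketch below is illustrative only: it assumes students are drawn independently from the sibling distribution above, and the function name `average_siblings` and the fixed seed are choices made here, not part of the text.

```python
import random

# Possible values of X and their probabilities (the sibling distribution above).
values = [0, 1, 2, 3]
probabilities = [0.2, 0.2, 0.4, 0.2]


def average_siblings(n_students, seed=0):
    """Draw n_students values of X at random and return their sample average."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same sample
    sample = rng.choices(values, weights=probabilities, k=n_students)
    return sum(sample) / n_students


# The sample average should settle near the mean 1.6 as the sample grows.
for n in (10, 1_000, 100_000):
    print(n, average_siblings(n))
```

For small samples the average can be noticeably off from 1.6; for large samples it should be close to it.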
Here we explain why the population mean is given by [latex]\mu = \sum xP(X=x)[/latex]. Suppose there are [latex]N=5[/latex] students, one has no siblings [latex](x_1 = 0)[/latex], one has one sibling [latex](x_2 = 1)[/latex], two have two siblings [latex](x_3 = x_4 =2)[/latex], and one has three siblings [latex](x_5 =3)[/latex]. Recall that the population mean [latex]\mu[/latex] is calculated as:
[latex]\begin{align*}\mu&=\frac{\sum x_i}{N}=\frac{x_1+x_2+x_3+x_4+x_5}{5}=\frac{0+1+2+2+3}{5}=\frac{{\color{blue} 0}\times {\color{red} 1}+{\color{blue} 1}\times {\color{red} 1}+{\color{blue} 2}\times{\color{red} 2}+{\color{blue} 3}\times {\color{red} 1}}{5}\\&={\color{blue} 0} \times {\color{red} \frac{1}{5}}+{\color{blue} 1}\times {\color{red} \frac{1}{5}}+{\color{blue} 2}\times {\color{red} \frac{2}{5}}+{\color{blue} 3}\times {\color{red} \frac{1}{5}}\\&={\color{blue} 0}\times {\color{red} P(X=0)}+{\color{blue} 1}\times {\color{red} P(X=1)}+{\color{blue} 2}\times{\color{red} P(X=2)}+{\color{blue} 3}\times{\color{red} P(X=3)}\\&=\sum {\color{blue} x} {\color{red} P(X=x)}.\end{align*}[/latex]
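The same equivalence can be checked numerically. The sketch below, with variable names chosen here for illustration, computes the population mean directly from the five sibling counts and again as a probability-weighted sum. Exact fractions are used so the two results can be compared without rounding error.

```python
from collections import Counter
from fractions import Fraction

# Sibling counts for the five students: Mark, John, Rebecca, Sarah, Mary.
siblings = [0, 1, 2, 2, 3]
N = len(siblings)

# Population mean computed directly: (sum of all values) / N.
direct_mean = Fraction(sum(siblings), N)

# Relative frequency of each distinct value, i.e. P(X = x).
counts = Counter(siblings)
probabilities = {x: Fraction(count, N) for x, count in counts.items()}

# Mean computed as the weighted sum of x * P(X = x).
weighted_mean = sum(x * p for x, p in probabilities.items())

print(direct_mean, weighted_mean)  # 8/5 8/5
```

Both routes give 8/5 = 1.6, because grouping equal values and dividing each count by N is exactly the step that turns counts into probabilities.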
Similarly, the variance of a discrete variable X can be calculated as
[latex]\sigma^2=\underbrace{\sum (x-\mu)^2 P(X=x)}\limits_{\mbox{defining formula}},[/latex]
which is a weighted average of the squared distance from each value [latex]x[/latex] to the population mean [latex]\mu[/latex], weighted by its probability [latex]P(X=x)[/latex] (relative frequency). It can be shown that
[latex]\sigma^2=\sum (x-\mu)^2 P(X=x)=\underbrace{\sum x^2P(X=x)-\mu^2}\limits_{\mbox{computing formula}}.[/latex]
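One way to see why the defining and computing formulas agree is to expand the square and use two facts already stated in this section: the probabilities sum to one, [latex]\sum P(X=x) = 1[/latex], and the mean is [latex]\mu = \sum xP(X=x)[/latex]. A sketch of the steps:
[latex]\begin{align*} \sum (x-\mu)^2 P(X=x) &= \sum \left(x^2 - 2\mu x + \mu^2\right) P(X=x) \\ &= \sum x^2 P(X=x) - 2\mu \sum x P(X=x) + \mu^2 \sum P(X=x) \\ &= \sum x^2 P(X=x) - 2\mu \cdot \mu + \mu^2 \cdot 1 \\ &= \sum x^2 P(X=x) - \mu^2. \end{align*}[/latex]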
Taking the square root of the variance [latex]\sigma^2[/latex] gives the standard deviation [latex]\sigma[/latex]:
[latex]\sigma=\sqrt{\sigma^2}=\underbrace{\sqrt{\sum (x-\mu)^2 P(X=x)}}\limits_{\mbox{defining formula}}=\underbrace{\sqrt{\sum x^2 P(X=x)-\mu^2}}\limits_{\mbox{computing formula}}.[/latex]
Example: Standard Deviation of a Discrete Variable
Mark has no siblings, John has one sibling, both Rebecca and Sarah have two siblings, and Mary has three. Randomly pick one student and let X be the number of siblings the student has. Find the standard deviation of X.
We can find the standard deviation using a working table:
Table 4.4: Standard Deviation Using Computing Formula
[latex]x[/latex]  [latex]P(X=x)[/latex]  [latex]x^2[/latex]  [latex]x^2P(X=x)[/latex] 
0  0.2  [latex]0^2 = 0[/latex]  0 × 0.2 = 0
1  0.2  [latex]1^2 = 1[/latex]  1 × 0.2 = 0.2
2  0.4  [latex]2^2 = 4[/latex]  4 × 0.4 = 1.6
3  0.2  [latex]3^2 = 9[/latex]  9 × 0.2 = 1.8
Sum  [latex]\sum P(X=x) = 1.0[/latex]  [latex]\sum x^2 P(X=x) = 3.6[/latex] 
The standard deviation of X is [latex]\sigma = \sqrt{\sum x^2 P(X = x) - \mu^2} = \sqrt{3.6 - 1.6^2} = \sqrt{1.04} \approx 1.02.[/latex]
Interpretation: Roughly speaking, on average, the number of siblings of those five students is 1.02 away from the mean 1.6.
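As a check on the working table, the following Python sketch computes the variance of the sibling distribution with both the defining formula and the computing formula and then takes the square root. The variable names are illustrative choices made here, not part of the text.

```python
import math

# Probability distribution of X = number of siblings.
distribution = {0: 0.2, 1: 0.2, 2: 0.4, 3: 0.2}

# Mean: sum of x * P(X = x).
mu = sum(x * p for x, p in distribution.items())

# Defining formula: sum of (x - mu)^2 * P(X = x).
var_defining = sum((x - mu) ** 2 * p for x, p in distribution.items())

# Computing formula: sum of x^2 * P(X = x), minus mu^2.
var_computing = sum(x ** 2 * p for x, p in distribution.items()) - mu ** 2

sigma = math.sqrt(var_computing)
print(round(var_defining, 4), round(var_computing, 4), round(sigma, 2))  # 1.04 1.04 1.02
```

The two variance values agree up to floating-point rounding. For hand calculation the computing formula is usually more convenient because it avoids subtracting the mean from every value.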
Exercise: Discrete Random Variable and Its Probability Distribution
For one insurance policy, the company pays out $10,000 if the customer dies, $5,000 if the customer is disabled and $0 for other situations. Suppose the probability of death is 0.001 and the probability of being disabled is 0.002. Let X be the amount of money the company pays.
 Find the probability distribution of X. Complete the following table:
[latex]x($)[/latex]  [latex]P(X=x)[/latex] 
10000  
5000  
 Find the mean (expected value) of X.
 Find the standard deviation of X.
 Suppose the company wants to make an average profit of $50 per customer. Calculate the premium it should charge each customer.
Answer
 The probability distribution consists of two components: possible values and probabilities.
[latex]x($)[/latex]  [latex]P(X=x)[/latex] 
10000  0.001 
5000  0.002 
0  0.997 
 Based on the working table below, the mean is calculated as [latex]\mu = \sum xP(X=x) = 10+10 + 0 = 20[/latex].
[latex]x($)[/latex]  [latex]P(X=x)[/latex]  [latex]xP(X=x)[/latex]
10000  0.001  10000 × 0.001 = 10
5000  0.002  5000 × 0.002 = 10
0  0.997  0 × 0.997 = 0
Sum  [latex]\sum P(X=x) = 1.000[/latex]  [latex]\sum xP(X=x) = 20[/latex]
Interpretation: On average, the company pays out $20 for each customer.
 Based on the working table below, the standard deviation is given by
[latex]\sigma = \sqrt{ \sum x^2 P(X=x) - \mu^2 } = \sqrt{ 150000 - 20^2 } = \sqrt{ 149600 } \approx 386.78.[/latex]
[latex]x($)[/latex]  [latex]P(X=x)[/latex]  [latex]x^2P(X=x)[/latex]
10000  0.001  [latex]10000^2 \times 0.001 = 100000[/latex]
5000  0.002  [latex]5000^2 \times 0.002 = 50000[/latex]
0  0.997  [latex]0^2 \times 0.997 = 0[/latex]
Sum  [latex]\sum P(X=x)= 1.000[/latex]  [latex]\sum x^2 P(X=x) = 150000[/latex]
Interpretation: Roughly speaking, on average, the payout is $386.78 different from the mean $20.
Note: The standard deviation in this example does not have much practical meaning. Compared with the expected value, the standard deviation is large, indicating large variation in the payout. This is because the payout is one of $0, $5,000, or $10,000, and the majority of customers receive a payout of $0.
 On average, the company pays out $20 per customer. If the company wants to make an average profit of $50 per customer, it must charge $20 on top of the $50 to cover the expected payout. Therefore, the company should charge 50 + 20 = 70 dollars per customer.
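The whole exercise can be reproduced in a few lines. This is a minimal Python sketch using the probabilities given in the problem; the variable names, and the rule "premium = expected payout + target profit" taken from the answer above, are the only assumptions.

```python
import math

# Payout distribution: payout in dollars -> probability.
# P(X = 0) = 1 - 0.001 - 0.002 = 0.997.
payout_distribution = {10_000: 0.001, 5_000: 0.002, 0: 0.997}
target_profit = 50  # desired average profit per customer, in dollars

# Expected payout: sum of x * P(X = x).
expected_payout = sum(x * p for x, p in payout_distribution.items())

# Computing formula for the variance, then the standard deviation.
variance = sum(x ** 2 * p for x, p in payout_distribution.items()) - expected_payout ** 2
sd = math.sqrt(variance)

# Premium that covers the expected payout and leaves the target profit on average.
premium = expected_payout + target_profit

print(round(expected_payout, 2), round(sd, 2), round(premium, 2))  # 20.0 386.78 70.0
```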
Exercise: Mean and Standard Deviation of a Discrete Random Variable
Let [latex]X[/latex] be the number of patients arriving at an emergency centre from 9 to 9:30 PM. The probability distribution of [latex]X[/latex] is given in the following table.
[latex]x[/latex]  0  1  2 
[latex]P (X = x)[/latex]  0.3  4a  3a 
 Find the value of a.
 Find the mean of X.
 Find the standard deviation of X.
Answer
 Since the sum of probabilities [latex]P(X=x)[/latex] is one, [latex]0.3 + 4a + 3a = 1 \Longrightarrow 7a = 0.7 \Longrightarrow a = 0.1.[/latex]
 The mean is [latex]\mu = \sum xP(X=x)=0 \times 0.3 + 1 \times 0.4 + 2 \times 0.3 = 1[/latex].
 The standard deviation is given by:
[latex]\begin{align*} \sigma&= \sqrt{ \sum x^2 P(X=x) - \mu^2 } = \sqrt{ ( 0^2 \times 0.3 + 1^2 \times 0.4 + 2^2 \times 0.3 ) - 1^2 }\\ &=\sqrt{1.6-1^2}= \sqrt{ 0.6 } \approx 0.775 . \end{align*}[/latex]
It might be helpful to construct the working table below:
[latex]x[/latex]  [latex]P(X=x)[/latex]  [latex]xP(X=x)[/latex]  [latex]x^2 P(X=x)[/latex] 
0  0.3  0×0.3=0  [latex]0^2\times 0.3=0[/latex] 
1  0.4  1×0.4=0.4  [latex]1^2\times 0.4=0.4[/latex] 
2  0.3  2×0.3=0.6  [latex]2^2\times 0.3=1.2[/latex] 
Sum  1.0  [latex]\sum xP(X=x)=1.0[/latex]  [latex]\sum x^2P(X=x)=1.6[/latex] 
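The three parts of this exercise can also be verified in code. The sketch below is illustrative: it solves 0.3 + 4a + 3a = 1 for a with exact fractions, then applies the mean and computing formulas. The variable names are chosen here and are not part of the text.

```python
import math
from fractions import Fraction

# Solve 0.3 + 4a + 3a = 1 for a, i.e. 7a = 1 - 0.3.
a = (1 - Fraction(3, 10)) / 7

# Probability distribution of X = number of patients arriving.
distribution = {0: Fraction(3, 10), 1: 4 * a, 2: 3 * a}
assert sum(distribution.values()) == 1  # probabilities must sum to one

# Mean and variance (computing formula), kept as exact fractions.
mu = sum(x * p for x, p in distribution.items())
variance = sum(x ** 2 * p for x, p in distribution.items()) - mu ** 2

# Standard deviation as a decimal.
sigma = math.sqrt(variance)

print(a, mu, variance, round(sigma, 3))  # 1/10 1 3/5 0.775
```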