4.4 Mean and Standard Deviation of a Discrete Variable

Given the probability distribution of a discrete random variable X, we are able to calculate the mean, variance, and standard deviation.

The mean of a discrete variable X can be calculated as

[latex]\mu = \sum xP(X=x),[/latex]

which is a weighted average over all possible values of X. Each possible value x is weighted by its probability [latex]P(X=x)[/latex]. The mean is also called the expected value or the expectation of X. Note that a probability distribution can be viewed as the relative frequency distribution of some population. In this regard, the mean of a discrete random variable is equivalent to the population mean.

Some students might find it easier to find the mean by constructing a working table (see the following example).

Example: Mean of a Discrete Variable

Mark has no siblings, John has one sibling, both Rebecca and Sarah have two siblings, and Mary has three. Randomly pick one student and let X be the number of siblings the student has. Find the mean (expected value) of X.

We calculate the mean of X by constructing a working table. The first two columns of the table give the probability distribution of X and, in each row, the value in the third column is the product of the first two values.

Table 4.3: Working Table for the Mean (Expected Value) of a Discrete Variable

[latex]x[/latex] [latex]P(X=x)[/latex] [latex]xP(X=x)[/latex]
0 0.2 0 × 0.2 = 0
1 0.2 1 × 0.2 = 0.2
2 0.4 2 × 0.4 = 0.8
3 0.2 3 × 0.2 = 0.6
Sum [latex]\sum P(X=x) = 1.0[/latex] [latex]\sum x P(X=x) = 1.6[/latex]

Taking the sum of the values in the last column gives the mean (expected value) of X, i.e.,

[latex]\begin{align*} \mu &= \sum x P(X=x) \\ &= 0 + 0.2 + 0.8 + 0.6 \\ &= 1.6. \end{align*}[/latex]

Interpretation: On average, a student picked at random from these five has 1.6 siblings.
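The working-table calculation can also be sketched in a few lines of Python. This is only an illustration: the dictionary `distribution` and the helper `discrete_mean` are names chosen here, not part of the text.

```python
# Probability distribution of X = number of siblings (values from Table 4.3).
distribution = {0: 0.2, 1: 0.2, 2: 0.4, 3: 0.2}


def discrete_mean(dist):
    """Return the sum of x * P(X = x) over all possible values x."""
    return sum(x * p for x, p in dist.items())


mu = discrete_mean(distribution)
print(round(mu, 4))  # rounded to hide floating-point noise
```

Each term `x * p` corresponds to one entry in the last column of the working table, and the sum corresponds to the "Sum" row.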

Even though the random variable X (the number of siblings) can take only integer values, we keep the decimal place in the mean of X; that is, do not round the mean 1.6 to 2. To see why, suppose that this probability distribution describes a much larger population of students. Although it is counterintuitive to say that we expect a student to have 1.6 siblings, it is quite natural to say that we expect 10 students to have a total of 16 siblings, 100 students to have a total of 160 siblings, and so on. Across the entire population of students, the combined number of siblings is 1.6 times the number of students. Hence, the average number of siblings per student is 1.6.
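A small simulation can make this concrete. The sketch below is an illustration added here, not part of the original example: it assumes a hypothetical population of 100,000 students whose sibling counts follow the same distribution, and the seed and variable names are arbitrary choices.

```python
import random

values = [0, 1, 2, 3]
probabilities = [0.2, 0.2, 0.4, 0.2]

rng = random.Random(2024)  # fixed seed so the run is repeatable
n_students = 100_000       # hypothetical population size

# Draw the number of siblings for each simulated student.
sample = rng.choices(values, weights=probabilities, k=n_students)

total_siblings = sum(sample)
average = total_siblings / n_students
print(f"{n_students} students, {total_siblings} siblings, average {average:.3f}")
```

No single simulated student has 1.6 siblings, yet the total number of siblings divided by the number of students should land close to 1.6.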

Here we explain why the population mean is given by [latex]\mu = \sum xP(X=x)[/latex]. Suppose there are [latex]N=5[/latex] students: one has no siblings [latex](x_1 = 0)[/latex], one has one sibling [latex](x_2 = 1)[/latex], two have two siblings [latex](x_3 = x_4 =2)[/latex], and one has three siblings [latex](x_5 =3)[/latex]. Recall that the population mean [latex]\mu[/latex] is calculated as:

[latex]\begin{align*}\mu&=\frac{\sum x_i}{N}=\frac{x_1+x_2+x_3+x_4+x_5}{5}=\frac{0+1+2+2+3}{5}=\frac{{\color{blue} 0}\times {\color{red} 1}+{\color{blue} 1}\times {\color{red} 1}+{\color{blue} 2}\times{\color{red} 2}+{\color{blue} 3}\times {\color{red} 1}}{5}\\&={\color{blue} 0} \times {\color{red} \frac{1}{5}}+{\color{blue} 1}\times {\color{red} \frac{1}{5}}+{\color{blue} 2}\times {\color{red} \frac{2}{5}}+{\color{blue} 3}\times {\color{red} \frac{1}{5}}\\&={\color{blue} 0}\times {\color{red} P(X=0)}+{\color{blue} 1}\times {\color{red} P(X=1)}+{\color{blue} 2}\times{\color{red} P(X=2)}+{\color{blue} 3}\times{\color{red} P(X=3)}\\&=\sum {\color{blue} x} {\color{red} P(X=x)}.\end{align*}[/latex]
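The same equivalence can be checked numerically. The following sketch uses exact fractions so that no rounding is involved; the variable names are illustrative choices, not notation from the text.

```python
from collections import Counter
from fractions import Fraction

siblings = [0, 1, 2, 2, 3]  # Mark, John, Rebecca, Sarah, Mary
N = len(siblings)

# Population mean computed directly from the five raw values.
mean_from_values = Fraction(sum(siblings), N)

# Relative frequencies play the role of P(X = x).
probabilities = {x: Fraction(count, N) for x, count in Counter(siblings).items()}

# Weighted average of the distinct values, weighted by P(X = x).
mean_from_distribution = sum(x * p for x, p in probabilities.items())

print(mean_from_values, mean_from_distribution)  # both print as 8/5
```

Grouping equal values with `Counter` mirrors the step in the derivation where the two students with two siblings are combined into the single term 2 × 2/5.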

Similarly, the variance of a discrete variable X can be calculated as

[latex]\sigma^2=\underbrace{\sum (x-\mu)^2 P(X=x)}\limits_{\mbox{defining formula}},[/latex]

which is a weighted average of the squared distance from each value [latex]x[/latex] to the population mean [latex]\mu[/latex], weighted by its probability [latex]P(X=x)[/latex] (relative frequency). It can be shown that

[latex]\sigma^2=\sum (x-\mu)^2 P(X=x)=\underbrace{\sum x^2P(X=x)-\mu^2}\limits_{\mbox{computing formula}}.[/latex]
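One way to see this is to expand the square and use the two facts [latex]\sum xP(X=x)=\mu[/latex] and [latex]\sum P(X=x)=1[/latex]. The steps below are a sketch of that argument:

[latex]\begin{align*} \sum (x-\mu)^2 P(X=x) &= \sum (x^2 - 2\mu x + \mu^2) P(X=x) \\ &= \sum x^2 P(X=x) - 2\mu \sum x P(X=x) + \mu^2 \sum P(X=x) \\ &= \sum x^2 P(X=x) - 2\mu \cdot \mu + \mu^2 \cdot 1 \\ &= \sum x^2 P(X=x) - \mu^2. \end{align*}[/latex]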

Taking the square root of the variance [latex]\sigma^2[/latex] gives the standard deviation [latex]\sigma[/latex]:

[latex]\sigma=\sqrt{\sigma^2}=\underbrace{\sqrt{\sum (x-\mu)^2 P(X=x)}}\limits_{\mbox{defining formula}}=\underbrace{\sqrt{\sum x^2 P(X=x)-\mu^2}}\limits_{\mbox{computing formula}}.[/latex]

Example: Standard Deviation of a Discrete Variable

Mark has no siblings, John has one sibling, both Rebecca and Sarah have two siblings, and Mary has three. Randomly pick one student and let X be the number of siblings the student has. Find the standard deviation of X.

We can find the standard deviation using a working table:

Table 4.4: Standard Deviation Using Computing Formula

[latex]x[/latex] [latex]P(X=x)[/latex] [latex]x^2[/latex] [latex]x^2P(X=x)[/latex]
0 0.2 [latex]0^2=0[/latex] 0 × 0.2 = 0
1 0.2 [latex]1^2=1[/latex] 1 × 0.2 = 0.2
2 0.4 [latex]2^2=4[/latex] 4 × 0.4 = 1.6
3 0.2 [latex]3^2=9[/latex] 9 × 0.2 = 1.8
Sum [latex]\sum P(X=x) = 1.0[/latex] [latex]\sum x^2 P(X=x) = 3.6[/latex]

The standard deviation of X is [latex]\sigma = \sqrt{\sum x^2 P(X = x) - \mu^2} = \sqrt{3.6 - 1.6^2} = \sqrt{1.04} \approx 1.02.[/latex]

Interpretation: Roughly speaking, the number of siblings of a student picked from these five differs from the mean of 1.6 by about 1.02 on average.
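As a check, the defining formula and the computing formula can be evaluated side by side. The Python sketch below is an illustration with variable names chosen here; it is not a prescribed method.

```python
import math

# Probability distribution of X = number of siblings.
distribution = {0: 0.2, 1: 0.2, 2: 0.4, 3: 0.2}

mu = sum(x * p for x, p in distribution.items())

# Defining formula: weighted average of squared deviations from the mean.
var_defining = sum((x - mu) ** 2 * p for x, p in distribution.items())

# Computing formula: weighted average of x^2, minus the squared mean.
var_computing = sum(x ** 2 * p for x, p in distribution.items()) - mu ** 2

sigma = math.sqrt(var_computing)
print(round(var_defining, 4), round(var_computing, 4), round(sigma, 2))
```

Both formulas give a variance of 1.04, up to floating-point rounding, and a standard deviation of about 1.02.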

Exercise: Discrete Random Variable and Its Probability Distribution

For one insurance policy, the company pays out $10,000 if the customer dies, $5,000 if the customer is disabled and $0 for other situations. Suppose the probability of death is 0.001 and the probability of being disabled is 0.002. Let X be the amount of money the company pays.

  1. Find the probability distribution of X. Complete the following table:
[latex]x\,(\$)[/latex] [latex]P(X=x)[/latex]
10000
5000
  2. Find the mean (expected value) of X.
  3. Find the standard deviation of X.
  4. Suppose the company wants to make an average profit of $50 per customer. Calculate the premium it should charge each customer.

Exercise: Mean and Standard Deviation of a Discrete Random Variable

Let [latex]X[/latex] be the number of patients arriving at an emergency centre from 9 to 9:30 PM. The probability distribution of [latex]X[/latex] is given in the following table.

[latex]x[/latex] 0 1 2
[latex]P (X = x)[/latex] 0.3 4a 3a
  1. Find the value of a.
  2. Find the mean of X.
  3. Find the standard deviation of X.
Answer:
  1. Since the sum of probabilities [latex]P(X=x)[/latex] is one, [latex]0.3 + 4a + 3a = 1 \Longrightarrow 7a = 0.7 \Longrightarrow a = 0.1.[/latex]
  2. The mean is [latex]\mu = \sum xP(X=x)=0 \times 0.3 + 1 \times 0.4 + 2 \times 0.3 = 1[/latex].
  3. The standard deviation is given by:

[latex]\begin{align*} \sigma&= \sqrt{ \sum x^2 P(X=x) - \mu^2 } = \sqrt{ ( 0^2 \times 0.3 + 1^2 \times 0.4 + 2^2 \times 0.3 ) - 1^2 }\\ &=\sqrt{1.6-1^2}= \sqrt{ 0.6 } \approx 0.775 . \end{align*}[/latex]

It might be helpful to construct the working table below:

[latex]x[/latex] [latex]P(X=x)[/latex] [latex]xP(X=x)[/latex] [latex]x^2 P(X=x)[/latex]
0 0.3 0×0.3=0 [latex]0^2\times 0.3=0[/latex]
1 0.4 1×0.4=0.4 [latex]1^2\times 0.4=0.4[/latex]
2 0.3 2×0.3=0.6 [latex]2^2\times 0.3=1.2[/latex]
Sum 1.0 [latex]\sum xP(X=x)=1.0[/latex] [latex]\sum x^2P(X=x)=1.6[/latex]
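The three answers can be double-checked with a short Python sketch. The code is an illustration only; it solves for a from the condition that the probabilities sum to one and then reuses the computing formula.

```python
import math

# The probabilities are 0.3, 4a, and 3a; they must sum to 1, so 7a = 0.7.
a = (1 - 0.3) / 7
distribution = {0: 0.3, 1: 4 * a, 2: 3 * a}

# Mean: sum of x * P(X = x).
mu = sum(x * p for x, p in distribution.items())

# Standard deviation via the computing formula.
sum_x2_p = sum(x ** 2 * p for x, p in distribution.items())
sigma = math.sqrt(sum_x2_p - mu ** 2)

print(round(a, 3), round(mu, 3), round(sigma, 3))
```

After rounding, the printed values should agree with a = 0.1, a mean of 1, and a standard deviation of about 0.775.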

License


Introduction to Applied Statistics Copyright © 2024 by Wanhua Su is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.