4.5 Binomial Distribution

A special and useful discrete probability distribution is the binomial distribution. Before introducing the binomial distribution, we first introduce Bernoulli trials.

Chance experiments satisfying the following conditions are called Bernoulli trials:

  1. Only two possible outcomes: success and failure.
  2. The probability of success, p, is the same on every trial.
  3. Trials are independent. Observing a head on the current trial won’t affect the result of the next trial. If trials are not strictly independent (for example, when sampling without replacement), the binomial model is still a reasonable approximation as long as the sample is less than 10% of the population.
  4. The number of trials is fixed.
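
The 10% guideline in condition 3 can be illustrated with a small calculation. The sketch below is only an illustration under assumed numbers: the populations (20 and 20,000 items, 60% of them successes) and the helper name `second_draw_success_prob` are hypothetical and do not come from the text. It compares the chance of a success on the second draw, given a success on the first, with the unconditional value p = 0.6.

```python
def second_draw_success_prob(successes, population):
    """P(success on draw 2 | success on draw 1) when sampling without replacement."""
    return (successes - 1) / (population - 1)

p = 0.6  # proportion of successes in both hypothetical populations

# Hypothetical small population: 12 successes among 20 items.
small = second_draw_success_prob(12, 20)
# Hypothetical large population: 12,000 successes among 20,000 items.
large = second_draw_success_prob(12_000, 20_000)

# How far the conditional probability drifts from p after one draw.
shift_small = abs(small - p)
shift_large = abs(large - p)

print(f"small population: {small:.5f} (shift {shift_small:.5f})")
print(f"large population: {large:.5f} (shift {shift_large:.5f})")
```

In the large population, removing one item barely changes the probability of success, so the draws behave almost like independent trials. In the small population the change is noticeable.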

4.5.1 Probability Distribution of a Binomial Random Variable

Let [latex]X[/latex] be the number of successes in a sequence of n Bernoulli trials with probability of success p. Then [latex]X[/latex] follows a binomial distribution with the probability distribution given by

[latex]P(X=x) =_nC_x p^x (1-p)^{n-x}, \quad x= 0, 1, ... , n.[/latex]

Note that even though the probability distribution is not given in table form, the formula specifies every possible value of x and its probability.

Example: Binomial Probability Distribution

If we roll a balanced die three times, find the probability of rolling exactly one six.

Let [latex]X[/latex] be the number of sixes observed. Then [latex]X[/latex] follows a binomial distribution with [latex]n=3, p=P(\mbox{observing a six})=\frac{1}{6}[/latex]. The probability of rolling exactly one six, [latex]P(X=1)[/latex], can be calculated using the binomial probability distribution formula with [latex]n=3, p=\frac{1}{6}, x=1[/latex]:

[latex]P(X=1)=_3C_1 (\frac{1}{6})^1 (1-\frac{1}{6})^{3-1}=3 (\frac{1}{6})^1(\frac{5}{6})^2=0.3472.[/latex]
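
The same calculation can be reproduced in a few lines of Python. This is a minimal sketch, not part of the original text: the function name `binomial_pmf` is an illustrative choice, and `math.comb` from the standard library supplies the binomial coefficient.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with parameters n and p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Die example: n = 3 rolls, p = 1/6, exactly one six.
prob_one_six = binomial_pmf(1, 3, 1 / 6)
print(round(prob_one_six, 4))  # 0.3472

# The probabilities over x = 0, 1, 2, 3 should add up to 1.
total_probability = sum(binomial_pmf(x, 3, 1 / 6) for x in range(4))
print(round(total_probability, 10))
```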

The binomial probability formula can be interpreted as follows. There are [latex]n[/latex] Bernoulli trials and hence [latex]n[/latex] outcomes, each either a success or a failure. We want to find the probability of observing [latex]x[/latex] successes. If we observe [latex]x[/latex] successes, there must be [latex](n-x)[/latex] failures. The probability of success is [latex]p[/latex] and the probability of failure is [latex](1-p)[/latex]. Hence, the probability of [latex]x[/latex] consecutive successes followed by [latex](n-x)[/latex] consecutive failures is [latex]p^x(1-p)^{n-x}[/latex], because the trials are independent and thus the special multiplication rule applies. Next, observe that there are [latex]_nC_x[/latex] ways to arrange [latex]x[/latex] successes and [latex](n-x)[/latex] failures, each with this same probability, and so the probability of [latex]x[/latex] successes, disregarding order, is [latex]_nC_xp^x(1-p)^{n-x}[/latex]. The term [latex]_nC_x[/latex] is called the binomial coefficient. For example, if we conduct the trial three times and observe one success, it can occur in the first, second, or third trial. The number of ways to observe one success out of three trials is [latex]_3C_1=3[/latex].
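
The counting argument can be checked by brute force. The sketch below lists every sequence of three trials for the die example (coding a success as 1 and a failure as 0, a convention assumed here for illustration), keeps those with exactly one success, and adds up their probabilities.

```python
from itertools import product
from math import comb

n, x = 3, 1   # three trials, one success
p = 1 / 6     # probability of rolling a six

# All sequences of n trials with exactly x successes (1 = success, 0 = failure).
sequences = [seq for seq in product([0, 1], repeat=n) if sum(seq) == x]

# By independence, each such sequence has probability p^x (1-p)^(n-x).
prob_each = p**x * (1 - p) ** (n - x)
total = len(sequences) * prob_each

print(sequences)                      # [(0, 0, 1), (0, 1, 0), (1, 0, 0)]
print(len(sequences) == comb(n, x))   # True
print(round(total, 4))                # 0.3472
```

The three sequences correspond to the six appearing on the third, second, or first roll, which matches the binomial coefficient for three trials and one success.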

Exercise: Binomial Distribution

Does the following random variable X follow a binomial distribution? If yes, identify the parameters n (total number of Bernoulli trials) and p (probability of success).

  1. Flip a balanced coin six times, and let [latex]X=[/latex] # of heads observed.
  2. A multiple-choice exam has ten questions, each with four answers: A, B, C, and D. I didn’t study, so I guess every answer. Let [latex]X[/latex] = # of correct answers.
  3. Roll a balanced die 10 times, let [latex]X[/latex] = # of sixes observed.
  4. There are 10 students: 6 female and 4 male. Randomly pick 5 students without replacement, let [latex]X=[/latex] # of female students.
  5. There are 10,000 students: 6,000 female and 4,000 male. Randomly pick 5 students without replacement, let [latex]X=[/latex] # of female students.
  6. A bad basketball player has a 10% chance of making a basket each time he tries. Assume trials are independent. He will continue trying until he has made 2 baskets. Let [latex]X=[/latex] number of trials.

4.5.2 Mean and Standard Deviation of Binomial Distribution

Recall that the general formulas for the mean and the standard deviation of a discrete variable are:

[latex]\mu=\sum xP(X=x), \quad \sigma=\sqrt{\sum x^2P(X=x)-\mu^2}.[/latex]

And the probability distribution of a binomial random variable is given by

[latex]P(X=x)=_nC_x p^x(1-p)^{n-x}, x=0, 1, \cdots, n.[/latex]

Substituting the binomial probability distribution into the general formulas for the mean and standard deviation gives the mean and standard deviation of a binomial random variable:

 

[latex]\begin{align*} \mu&=\sum xP(X=x)=\sum x \underbrace{[(_nC_x) \times (p^x)\times (1-p)^{n-x}]}\limits_{\mbox{prob dist'n of binomial}}=np;\\ \sigma&=\sqrt{\sum x^2P(X=x)-\mu^2}=\sqrt{\sum x^2 \underbrace{[(_nC_x) \times (p^x)\times (1-p)^{n-x}]}\limits_{\mbox{prob dist'n of binomial}}-(np)^2}=\sqrt{np(1-p)}. \end{align*}[/latex]

Note that the shortcuts [latex]\mu = np, \sigma = \sqrt{ np(1-p) }[/latex] apply only when the discrete random variable [latex]X[/latex] follows a binomial distribution with parameters [latex]n[/latex] and [latex]p[/latex]. For any other discrete random variable, use the general formulas.
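
A numerical check can make the shortcut formulas more concrete. The sketch below evaluates the general formulas for the mean and standard deviation term by term and compares them with np and the square root of np(1-p). The values n = 10 and p = 0.25 are an arbitrary choice for illustration, and `binomial_pmf` is an illustrative helper name.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with parameters n and p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.25
values = range(n + 1)

# General formulas for any discrete random variable.
mu_general = sum(x * binomial_pmf(x, n, p) for x in values)
sigma_general = sqrt(sum(x**2 * binomial_pmf(x, n, p) for x in values) - mu_general**2)

# Shortcut formulas for the binomial distribution.
mu_shortcut = n * p
sigma_shortcut = sqrt(n * p * (1 - p))

print(mu_general, mu_shortcut)
print(sigma_general, sigma_shortcut)
```

Up to floating-point rounding, the two approaches agree, which is what the derivation above asserts.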

4.5.3 Steps to Find Probabilities Related to Binomial Distribution

In order to apply the binomial probability formula, we need to make sure that the variable follows a binomial distribution by checking:

  1. Does each trial in the experiment have only two possible outcomes?
  2. Are the trials independent?
  3. Does each trial have the same probability of success?
  4. Is the number of trials fixed?

If the answers to all the above questions are yes and we perform n trials, let [latex]X=[/latex] # of successes, then we can claim that [latex]X[/latex] follows a binomial distribution. We can apply the binomial probability formula as follows:

  1. Identify the success event.
  2. Determine the probability of success p.
  3. Determine n, the total number of trials.
  4. Write down the event of interest in terms of the binomial variable [latex]X[/latex].
  5. Apply the binomial probability formula [latex]P(X=x) = _nC_xp^x(1-p)^{n-x}[/latex] to calculate the probability of each outcome in the event, and then add these probabilities together.
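
Steps 4 and 5 can be sketched in code by writing the event as a collection of values of X and adding the probability of each one. This is a minimal sketch: the helper names `binomial_pmf` and `event_probability` and the coin example (six flips, two or three heads) are assumptions made for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with parameters n and p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def event_probability(outcomes, n, p):
    """Add P(X = x) over every value x that belongs to the event."""
    return sum(binomial_pmf(x, n, p) for x in outcomes)

# Hypothetical example: flip a balanced coin six times.
n, p = 6, 0.5            # steps 2 and 3: probability of success and number of trials
event = [2, 3]           # step 4: the event {X = 2 or X = 3}
prob = event_probability(event, n, p)   # step 5: add the individual probabilities
print(prob)  # 0.546875
```

Writing the event as a list of values keeps the code close to the hand calculation, where each outcome in the event gets one application of the formula.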

Example: Application of Binomial Distribution

A quiz consists of 10 multiple-choice questions with four choices: A, B, C, and D. I did not study and randomly picked one answer for each question.

  1. Find the probability that I get six correct answers.
  2. Find the probability that I get at least one correct answer.
  3. Find the probability that I get at least nine correct answers.
  4. How many correct answers do you expect me to get?

Solutions:

For each question, I either get the correct answer or not. Since I randomly picked one answer, each of the four choices had the same chance of being chosen. Therefore, since there is only one correct answer, the probability of obtaining the correct answer is ¼. Whether I obtain the correct answer for the current question will not affect the chance of getting the correct answer for the next question; therefore, the trials are independent with a constant probability of success. Let [latex]X[/latex] = # of correct answers, then [latex]X[/latex] follows a binomial distribution.

  1. Identify the success event. Since [latex]X[/latex] = # of correct answers = # of successes, getting a correct answer on a guess is a success.
  2. Determine the probability of success p.
    [latex]p = \text{probability of getting a correct answer} = \frac{1}{4}=0.25.[/latex]
  3. Determine [latex]n[/latex], the total number of trials. Since the quiz has 10 questions, and we randomly pick one answer per question, this is a sequence of 10 Bernoulli trials. Therefore, [latex]n = 10[/latex].

Let [latex]X[/latex] = # of correct answers. Then [latex]X[/latex] follows a binomial distribution with [latex]n = 10[/latex] and [latex]p = 0.25[/latex]. The probability distribution is

[latex]P(X=x) = _nC_xp^x(1-p)^{n-x} = _{10}C_x0.25^x(1-0.25)^{10-x}  = _{10}C_x0.25^x 0.75^{10-x},[/latex]

for [latex]x=0, 1, \cdots, 10[/latex].

  1. Find the probability that I get six correct answers.
    Event: [latex]\{X = 6\}[/latex] with probability [latex]P(X=6) = _{10}C_6(0.25^6)(0.75^{10-6}) = 210(0.25^6)(0.75^{4}) = 0.01622.[/latex]
  2. Find the probability that I get at least one correct answer.

Event: [latex]\{X \geq 1\}[/latex]. That is one or more correct answers. By the complement rule,

[latex]\begin{align*} P(X \geq 1) &= P(X=1) + P(X=2) + ... + P(X=9) + P(X=10) = 1 - P(X=0) \\ &= 1 - _{10}C_0 (0.25^0)(0.75^{10-0}) =1-0.75^{10}= 0.9437. \end{align*}[/latex]

By using the complement rule, we only need to apply the binomial probability formula once. Adding [latex]P(X=1)[/latex] through [latex]P(X=10)[/latex] directly would require applying it 10 times.

  3. Find the probability that I get at least nine correct answers.

Event: [latex]\{X \geq 9\}[/latex]. That is 9 or 10 correct answers.

[latex]\begin{align*} P(X \geq 9) &= P(X=9) + P(X=10) \\ &= _{10} C_9(0.25^9)(0.75^{10-9}) + _{10}C_{10}(0.25)^{10}(0.75^{10-10}) \\ &= 10(0.25^9)(0.75^{1}) + 1(0.25)^{10}(0.75^{0}) \\ &= 2.861023 \times 10^{-5} + 9.536743 \times 10^{-7} \\ &= 2.9564 \times 10^{-5}. \end{align*}[/latex]

  4. How many correct answers do you expect me to get?
    Since X follows a binomial distribution, the expected value (the mean) of X is
    [latex]\mu = np = 10 \times 0.25 = 2.5.[/latex]

Interpretation: For every 10 questions, I expect to obtain 2.5 correct guesses.

Note that we do not round the expected value. Even though it is not possible to observe 2.5 correct answers, this would be the long-run average if I were to repeat this experiment many times. An alternative viewpoint is this: expecting 2.5 correct answers for every 10 guesses is equivalent to expecting 25 correct answers for every 100 guesses, 250 correct answers for every 1000 guesses, and so on.
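
The quiz answers and the long-run interpretation of the mean can both be checked numerically. The sketch below first recomputes the four answers with the binomial formula, then simulates many quizzes of 10 random guesses and averages the scores. The number of repetitions (20,000), the seed, and the function names are arbitrary choices for illustration, and the simulated average varies slightly with the seed.

```python
import random
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with parameters n and p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.25

# Exact answers to the four questions.
p_six = binomial_pmf(6, n, p)
p_at_least_one = 1 - binomial_pmf(0, n, p)          # complement rule
p_at_least_nine = binomial_pmf(9, n, p) + binomial_pmf(10, n, p)
expected = n * p

print(round(p_six, 5), round(p_at_least_one, 4), p_at_least_nine, expected)

# Simulation: each guess is correct with probability p, independently.
rng = random.Random(0)  # fixed seed so the run is repeatable

def simulate_quiz():
    """Number of correct answers in one quiz of n random guesses."""
    return sum(rng.random() < p for _ in range(n))

repetitions = 20_000
scores = [simulate_quiz() for _ in range(repetitions)]
long_run_average = sum(scores) / repetitions
print(long_run_average)  # close to 2.5, though no single score equals 2.5
```

Every simulated score is a whole number between 0 and 10, yet the average over many quizzes settles near 2.5, which is why the expected value is not rounded.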

Exercise: Lotto 649

Lotto 649, launched in 1982, is one of three national lottery games in Canada. Each play costs $3 and includes one set of 6 numbers ranging from 1 to 49 for the Main Jackpot Draw. If the 6 numbers a player chooses match all 6 winning numbers (order does not matter), the player wins the jackpot. The six winning numbers were 2, 8, 9, 16, 39, and 49 for the Wednesday, April 6 Lotto 649, and the jackpot prize was $18.7 million.

If I buy one Lotto 649 ticket each month for the next 10 years, what is the probability that I will win at least one jackpot?

  1. Do we have independent Bernoulli trials?
  2. Let [latex]X=[/latex] # of jackpots I win over the next 10 years. Does [latex]X[/latex] follow a binomial distribution?
Answer:
  1. For each Lotto 649 ticket, there are only two possible outcomes: I either win the jackpot or do not win the jackpot. Purchasing one ticket per month for the next 10 years yields a total of 12 x 10 = 120 tickets, i.e., 120 independent Bernoulli trials.
  2. The question asks for # of jackpots to be won in the coming 10 years; therefore, winning a jackpot is a success. The probability of success is [latex]p = \frac{1}{_{49}C_6}[/latex].
    Let X = # of jackpots won in the coming 10 years. Then X follows a binomial distribution with [latex]n=120, p = \frac{1}{_{49}C_6} = 0.0000000715[/latex]. The probability distribution is
    [latex]\begin{align*} P(X=x) &= _nC_xp^x(1-p)^{n-x}\\  &= _{120} C_x 0.0000000715^x(1 - 0.0000000715)^{120-x}. \end{align*}[/latex]

Event: at least one Jackpot= [latex]\{ X \geq 1 \}[/latex] with probability

[latex]\begin{align*} P(X \geq 1) &= 1 - P(X=0) \\ &= 1 - _{120}C_0 (0.0000000715^0)[(1 - 0.0000000715)^{120-0}] \\ &=1-(1 - 0.0000000715)^{120}\\ &= 1 - 0.9999914 = 0.0000086. \end{align*}[/latex]
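
The Lotto 649 numbers can be reproduced as follows. Because p is extremely small, this sketch evaluates 1 - (1 - p)^120 through `math.log1p` and `math.expm1`, a numerical-stability choice made here rather than something the text prescribes. The direct formula is computed alongside for comparison.

```python
from math import comb, expm1, log1p

# Number of ways to choose 6 numbers out of 49 (order does not matter).
combinations = comb(49, 6)
p_win = 1 / combinations      # probability that one ticket wins the jackpot
n_tickets = 12 * 10           # one ticket per month for 10 years

# P(X >= 1) = 1 - (1 - p)^n, written with log1p/expm1 to limit rounding error.
p_at_least_one = -expm1(n_tickets * log1p(-p_win))

# The direct formula gives essentially the same value at this scale.
p_direct = 1 - (1 - p_win) ** n_tickets

print(combinations)               # 13983816
print(f"{p_win:.10f}")            # 0.0000000715
print(f"{p_at_least_one:.7f}")    # 0.0000086
```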

Exercise: Application of Binomial Distribution

Roll a balanced die four times.

  1. Find the probability of observing a six at least once.
  2. Find the probability of observing a six exactly once.
  3. Find the probability of observing a six between two and four times inclusively.
  4. How many times do we expect to observe a six?
Answer:

Let X = number of sixes observed among the four rolls. Then X follows a binomial distribution with n=4, p=1/6.

[latex]\begin{align*} P(X \geq 1) &= 1- P(X=0) \\ &= 1 - _4C_0 (\frac{1}{6})^0 (1 - \frac{1}{6})^{4-0} \\ &= 1 - 0.482 = 0.518. \end{align*}[/latex]

[latex]\begin{align*} P(X =1) &= _4C_1 (\frac{1}{6})^1 (1 - \frac{1}{6})^{4-1} \\ &= 0.386. \end{align*}[/latex]

[latex]\begin{align*} P(2 \leq X \leq 4) &=  P(X=2) + P(X=3) + P(X=4) \\ &= 1 - P(X=0) - P(X=1) \\ &= 1 - _4C_0 (\frac{1}{6})^0 (1 - \frac{1}{6})^{4-0} - _4C_1 (\frac{1}{6})^1 (1 - \frac{1}{6})^{4-1} \\ &= 1 - 0.482 - 0.386 = 0.132. \end{align*}[/latex]

  4. Since X follows a binomial distribution, its mean (expected value) is [latex]\mu = np = 4 \times \frac{1}{6} = \frac{2}{3} = 0.667[/latex].
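
These answers can be verified exactly with rational arithmetic. The sketch below uses `fractions.Fraction` from the Python standard library so that no rounding occurs until the final display. The variable and function names are illustrative choices.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with parameters n and p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 4, Fraction(1, 6)

p_at_least_one = 1 - binomial_pmf(0, n, p)                        # complement rule
p_exactly_one = binomial_pmf(1, n, p)
p_two_to_four = sum(binomial_pmf(x, n, p) for x in range(2, 5))   # x = 2, 3, 4
mean = n * p

for label, value in [
    ("at least one six", p_at_least_one),
    ("exactly one six", p_exactly_one),
    ("two to four sixes", p_two_to_four),
    ("expected number of sixes", mean),
]:
    print(label, value, round(float(value), 3))
```

The exact values are 671/1296, 125/324, 19/144, and 2/3, which round to 0.518, 0.386, 0.132, and 0.667.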

 

License


Introduction to Applied Statistics Copyright © 2024 by Wanhua Su is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.