next up previous
Next: Gaussian Probability Distribution Up: Probability Theory Previous: Mean, Variance, and Standard

Application to Binomial Probability Distribution

Let us now apply what we have just learned about the mean, variance, and standard deviation of a general probability distribution function to the specific case of the binomial probability distribution. Recall, from Section 2.6, that if a simple system has just two possible outcomes, denoted 1 and 2, with respective probabilities $ p$ and $ q=1-p$ , then the probability of obtaining $ n_1$ occurrences of outcome 1 in $ N$ observations is

$\displaystyle P_N(n_1) = \frac{N!}{n_1 ! (N-n_1)!}  p^{ n_1} q^{ N-n_1}.$ (2.38)

Thus, making use of Equation (2.27), the mean number of occurrences of outcome 1 in $ N$ observations is given by

$\displaystyle \overline{n_1} = \sum_{n_1=0,N} P_N(n_1) n_1 = \sum_{n_1=0,N} \frac{N!}{n_1! (N-n_1)!} p^{ n_1} q^{ N-n_1}  n_1.$ (2.39)

We can see that if the final factor $ n_1$ were absent on the right-hand side of the previous expression then it would just reduce to the binomial expansion, which we know how to sum. [See Equation (2.23).] We can take advantage of this fact using a rather elegant mathematical sleight of hand. Observe that because

$\displaystyle n_1 p^{ n_1} \equiv p \frac{\partial}{\partial p} p^{ n_1},$ (2.40)

the previous summation can be rewritten as

$\displaystyle \sum_{n_1=0,N}\frac{N!}{n_1! (N-n_1)!} p^{ n_1} q^{ N-n_1}  n_1 \equiv p \frac{\partial}{\partial p}\left[\sum_{n_1=0,N} \frac{N!}{n_1! (N-n_1)!} p^{ n_1} q^{ N-n_1} \right].$ (2.41)

The term in square brackets is now the familiar binomial expansion, and can be written more succinctly as $ (p+q)^{ N}$ . Thus,

$\displaystyle \sum_{n_1=0,N}\frac{N!}{n_1! (N-n_1)!} p^{ n_1} q^{ N-n_1}  n_1 =p \frac{\partial}{\partial p}  (p+q)^{ N}= p N (p+q)^{ N-1}.$ (2.42)

However, $ p+q=1$ for the case in hand [see Equation (2.11)], so

$\displaystyle \overline{n_1} = N p.$ (2.43)
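The result (2.43) can be checked numerically by summing the binomial distribution (2.38) directly, as in the following sketch (the values N = 20 and p = 0.3 are arbitrary illustrative choices):

```python
from math import comb

# Illustrative parameters (not from the text): N trials, probability p.
N, p = 20, 0.3
q = 1.0 - p

def P(n1):
    """Binomial probability (2.38) of n1 occurrences of outcome 1 in N trials."""
    return comb(N, n1) * p**n1 * q**(N - n1)

# The distribution is normalized: the sum equals (p + q)^N = 1.
total = sum(P(n1) for n1 in range(N + 1))

# Mean computed from the sum (2.39) versus the closed form N p of (2.43).
mean = sum(n1 * P(n1) for n1 in range(N + 1))
print(total)       # ≈ 1.0
print(mean, N * p) # both ≈ 6.0
```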

In fact, we could have guessed the previous result. By definition, the probability, $ p$ , is the number of occurrences of the outcome 1 divided by the number of trials, in the limit as the number of trials goes to infinity:

$\displaystyle p= \lim_{N\rightarrow\infty} \frac{n_1}{N}.$ (2.44)

If we think carefully, however, we can see that taking the limit as the number of trials goes to infinity is equivalent to taking the mean value, so that

$\displaystyle p = \overline{\left(\frac{n_1}{N}\right)} = \frac{\overline{n_1}}{N}.$ (2.45)

But this is just a simple rearrangement of Equation (2.43).

Let us now calculate the variance of $ n_1$ . Recall, from Equation (2.36), that

$\displaystyle \overline{({\mit\Delta} n_1)^{ 2}}= \overline{(n_1)^{ 2}} - (\overline{n_1})^{ 2}.$ (2.46)

We already know $ \overline{n_1}$ , so we just need to calculate $ \overline{(n_1)^{ 2}}$ . This average is written

$\displaystyle \overline{(n_1)^{ 2}}=\sum_{n_1=0,N}\frac{N!}{n_1! (N-n_1)!} p^{ n_1}  q^{ N-n_1} (n_1)^{ 2}.$ (2.47)

The sum can be evaluated using a simple extension of the mathematical trick that we used previously to evaluate $ \overline{n_1}$ . Because

$\displaystyle (n_1)^{ 2}  p^{ n_1} \equiv \left(p \frac{\partial}{\partial p}\right)^{ 2} p^{ n_1},$ (2.48)

then

$\displaystyle \sum_{n_1=0,N}\frac{N!}{n_1! (N-n_1)!} p^{ n_1} q^{ N-n_1} (n_1)^{ 2}$ $\displaystyle \equiv \left(p \frac{\partial}{\partial p}\right)^2\sum_{n_1=0,N} \frac{N!}{n_1! (N-n_1)!} p^{ n_1}q^{ N-n_1}$    
  $\displaystyle = \left(p \frac{\partial}{\partial p}\right)^2(p+q)^{ N}$    
  $\displaystyle =\left(p \frac{\partial}{\partial p}\right)\left[p N  (p+q)^{ N-1}\right]$    
  $\displaystyle = p\left[N (p+q)^{ N-1}+p N (N-1) (p+q)^{ N-2}\right].$ (2.49)

Using $ p+q=1$ , we obtain

$\displaystyle \overline{(n_1)^{ 2}}$ $\displaystyle = p\left[N+p N (N-1)\right]= N p\left[1+p N-p\right]$    
  $\displaystyle = (N p)^{ 2} + N p q = \left(\overline{n_1}\right)^{ 2} + N p q,$ (2.50)

because $ \overline{n_1}= N p$ . [See Equation (2.43).] It follows that the variance of $ n_1$ is given by

$\displaystyle \overline{({\mit\Delta} n_1)^{ 2}}= \overline{(n_1)^{ 2}}- \left(\overline{n_1}\right)^{ 2} = N p q.$ (2.51)
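As with the mean, the variance formula (2.51) can be verified by direct summation over the distribution; the sketch below uses the same illustrative values N = 20 and p = 0.3:

```python
from math import comb

# Illustrative parameters (not from the text).
N, p = 20, 0.3
q = 1.0 - p
P = lambda n1: comb(N, n1) * p**n1 * q**(N - n1)

# First and second moments computed from sums over the distribution,
# as in Equations (2.39) and (2.47).
mean = sum(n1 * P(n1) for n1 in range(N + 1))
mean_sq = sum(n1**2 * P(n1) for n1 in range(N + 1))

# Variance via Equation (2.46), compared with the closed form N p q.
variance = mean_sq - mean**2
print(variance, N * p * q)  # both ≈ 4.2
```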

The standard deviation of $ n_1$ is the square root of the variance [see Equation (2.37)], so that

$\displaystyle {\mit\Delta}^\ast n_1 = \sqrt{N p q}.$ (2.52)

Recall that this quantity is essentially the width of the range over which $ n_1$ is distributed around its mean value. The relative width of the distribution is characterized by

$\displaystyle \frac{{\mit\Delta}^\ast n_1}{\overline{n_1}}= \frac{\sqrt{N  p q}}{N p} = \sqrt{\frac{q}{p}}\frac{1}{\sqrt{N}}.$ (2.53)

It is clear, from this formula, that the relative width decreases with increasing $ N$ like $ N^{ -1/2}$ . So, the greater the number of trials, the more likely it is that an observation of $ n_1$ will yield a result that is relatively close to the mean value, $ \overline{n_1}$ .
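The $ N^{ -1/2}$ scaling of the relative width (2.53) is easy to see numerically. For the symmetric case $ p=q=1/2$ , the relative width reduces to $ 1/\sqrt{N}$ , as the following sketch illustrates:

```python
from math import sqrt

# Relative width (2.53) for p = q = 1/2: Delta*n_1 / mean = 1/sqrt(N).
p = 0.5
q = 1.0 - p
for N in (100, 10_000, 1_000_000):
    rel_width = sqrt(N * p * q) / (N * p)
    print(N, rel_width)  # 0.1, 0.01, 0.001

# Each factor of 100 in N shrinks the relative width by a factor of 10.
```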


Richard Fitzpatrick 2016-01-25