The Gaussian distribution

Consider a very large number of observations, $N\gg 1$, made on a system with two possible outcomes. Suppose that the probability of outcome 1 is sufficiently large that the average number of occurrences after $N$ observations is much greater than unity:
\begin{displaymath}
\overline{n_1} = N\,p \gg 1.
\end{displaymath} (54)

In this limit, the standard deviation of $n_1$ is also much greater than unity,
\begin{displaymath}
{\mit\Delta}^\ast n_1 = \sqrt{N\,p\,q}\gg 1,
\end{displaymath} (55)

implying that there are very many probable values of $n_1$ scattered about the mean value $\overline{n_1}$. This suggests that the probability of obtaining $n_1$ occurrences of outcome 1 does not change significantly in going from one possible value of $n_1$ to an adjacent value:
\begin{displaymath}
\frac{\vert P_N(n_1+1)-P_N(n_1)\vert}{P_N(n_1)} \ll 1.
\end{displaymath} (56)
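
This expectation is easily confirmed. The binomial formula gives the exact ratio of adjacent probabilities,
\begin{displaymath}
\frac{P_N(n_1+1)}{P_N(n_1)} = \frac{(N-n_1)\,p}{(n_1+1)\,q},
\end{displaymath}
which, evaluated at $n_1 = \overline{n_1} = N\,p$, reduces to $N\,p/(N\,p+1)\simeq 1 - 1/\overline{n_1}$. The fractional change between adjacent values is thus of order $1/\overline{n_1}\ll 1$.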

In this situation, it is useful to regard the probability as a smooth function of $n_1$. Let $n$ be a continuous variable which is interpreted as the number of occurrences of outcome 1 (after $N$ observations) whenever it takes on a positive integer value. The probability that $n$ lies between $n$ and $n+dn$ is defined
\begin{displaymath}
P(n, n+dn) = {\cal P}(n)\, dn,
\end{displaymath} (57)

where ${\cal P}(n)$ is called the probability density, and is independent of $dn$. The probability can be written in this form because $P(n, n+dn)$ can always be expanded as a Taylor series in $dn$, and must go to zero as $dn\rightarrow 0$. We can write
\begin{displaymath}
\int_{n_1-1/2}^{n_1+1/2} {\cal P}(n)\, dn = P_N(n_1),
\end{displaymath} (58)

which is equivalent to smearing out the discrete probability $P_N(n_1)$ over the range $n_1\pm 1/2$. Given Eq. (56), the above relation can be approximated
\begin{displaymath}
{\cal P}(n) \simeq P_N(n) = \frac{N!}{n!\,(N-n)!}\,p^n\,q^{N-n}.
\end{displaymath} (59)

For large $N$, the relative width of the probability distribution function is small:

\begin{displaymath}
\frac{{\mit\Delta}^\ast n_1}{\overline{n_1}} = \sqrt{\frac{q}{p}}\frac{1}{\sqrt{N}}
\ll 1.
\end{displaymath} (60)
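
As a concrete illustration (the numbers are chosen purely for convenience), an unbiased two-outcome system, $p=q=1/2$, observed $N=10^6$ times has
\begin{displaymath}
\overline{n_1} = 5\times 10^5,\qquad
{\mit\Delta}^\ast n_1 = \sqrt{10^6/4} = 500,\qquad
\frac{{\mit\Delta}^\ast n_1}{\overline{n_1}} = 10^{-3}.
\end{displaymath}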

This suggests that ${\cal P}(n)$ is strongly peaked around the mean value $\overline{n}=\overline{n_1}$. Suppose that $\ln{\cal P}(n)$ attains its maximum value at $n=\tilde{n}$ (where we expect $\tilde{n}\sim\overline{n}$). Let us Taylor expand $\ln{\cal P}$ around $n=\tilde{n}$. Note that we expand the slowly varying function $\ln{\cal P}(n)$, instead of the rapidly varying function ${\cal P}(n)$, because the Taylor expansion of ${\cal P}(n)$ does not converge sufficiently rapidly in the vicinity of $n=\tilde{n}$ to be useful. We can write
\begin{displaymath}
\ln {\cal P}(\tilde{n}+\eta) \simeq \ln{\cal P}(\tilde{n}) +\eta\,
B_1+\frac{\eta^2}{2}\,B_2+\cdots ,
\end{displaymath} (61)

where
\begin{displaymath}
B_k = \left.\frac{d^k \ln {\cal P}}{d n^k}\right\vert _{n=\tilde{n}}.
\end{displaymath} (62)

By definition,
\begin{displaymath}
B_1 = 0,
\end{displaymath} (63)

\begin{displaymath}
B_2 < 0,
\end{displaymath} (64)

if $n=\tilde{n}$ corresponds to the maximum value of $\ln{\cal P}(n)$.

It follows from Eq. (59) that

\begin{displaymath}
\ln {\cal P} = \ln N! - \ln n!- \ln \,(N-n)! +n\ln p +(N-n)\ln q.
\end{displaymath} (65)

If $n$ is a large integer, such that $n\gg 1$, then $\ln n!$ is almost a continuous function of $n$, since $\ln n!$ changes by only a relatively small amount when $n$ is incremented by unity. Hence,
\begin{displaymath}
\frac{d\ln n!}{dn} \simeq \frac{\ln\,(n+1)!-\ln n!}{1} =
\ln\!\left[\frac{(n+1)!}{n!}\right] = \ln\,(n+1),
\end{displaymath} (66)

giving
\begin{displaymath}
\frac{d\ln n!}{d n} \simeq \ln n,
\end{displaymath} (67)

for $n\gg 1$. The integral of this relation
\begin{displaymath}
\ln n! \simeq n\,\ln n - n +{\cal O}(\ln n),
\end{displaymath} (68)

valid for $n\gg 1$, is called Stirling's approximation, after the Scottish mathematician James Stirling, who first obtained it in 1730.
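
Stirling's approximation is easily tested numerically. The following Python sketch uses only the standard library, obtaining $\ln n!$ from the log-gamma function [since $n! = \Gamma(n+1)$]; the test values are arbitrary illustrations:
\begin{verbatim}
import math

# Test of Stirling's approximation, Eq. (68): ln n! ~ n ln n - n.
# Exact values come from the log-gamma function, ln n! = lgamma(n+1).
for n in (10, 100, 1000, 10000):
    exact    = math.lgamma(n + 1)
    stirling = n * math.log(n) - n
    print(n, exact, stirling, (exact - stirling) / exact)
    # the relative error (last column) falls as n increases, since
    # the remainder in Eq. (68) grows only logarithmically
\end{verbatim}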

According to Eq. (65),

\begin{displaymath}
B_1 = -\ln\tilde{n} +\ln\,(N-\tilde{n})+\ln p - \ln q.
\end{displaymath} (69)

Hence, if $B_1=0$ then
\begin{displaymath}
(N-\tilde{n})\, p = \tilde{n}\,q,
\end{displaymath} (70)

giving
\begin{displaymath}
\tilde{n} = N\,p = \overline{n_1},
\end{displaymath} (71)

since $p+q=1$. Thus, the maximum of $\ln{\cal P}(n)$ occurs exactly at the mean value of $n$, which equals $\overline{n_1}$.
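
This prediction is easy to verify directly. The sketch below (the values of $N$ and $p$ are arbitrary choices for illustration) evaluates $\ln P_N(n)$ from Eq. (65), using the log-gamma function for the factorials, and locates its maximum:
\begin{verbatim}
import math

# Locate the maximum of ln P_N(n), Eq. (65), and confirm that it
# occurs at n = N p, as predicted by Eq. (71).  Illustrative values.
N, p = 1000, 0.3
q = 1.0 - p

def log_P(n):
    # ln P_N(n) from Eq. (65), with ln n! = lgamma(n + 1)
    return (math.lgamma(N + 1) - math.lgamma(n + 1)
            - math.lgamma(N - n + 1)
            + n * math.log(p) + (N - n) * math.log(q))

n_max = max(range(N + 1), key=log_P)
print(n_max, N * p)   # both give 300
\end{verbatim}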

Further differentiation of Eq. (65) yields

\begin{displaymath}
B_2 = -\frac{1}{\tilde{n}}-\frac{1}{N-\tilde{n}} =
-\frac{1}{Np}-\frac{1}{N\,(1-p)}= - \frac{1}{N\,p\,q},
\end{displaymath} (72)

since $p+q=1$. Note that $B_2<0$, as required. The above relation can also be written
\begin{displaymath}
B_2 = -\frac{1}{({\mit\Delta}^\ast n_1)^2}.
\end{displaymath} (73)
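
This curvature can also be confirmed by a finite-difference check. In the sketch below (illustrative values again), the logarithm of the adjacent-value ratio $P_N(n_1+1)/P_N(n_1)=(N-n_1)\,p/[(n_1+1)\,q]$ quoted earlier serves as the first difference of $\ln P_N$, and differencing it once more at the peak recovers $B_2$:
\begin{verbatim}
import math

# Finite-difference check of Eqs. (72)-(73).  The logarithm of the
# adjacent-value ratio r(n) = P_N(n+1)/P_N(n) = (N-n) p / [(n+1) q]
# is the first difference of ln P_N(n); its own first difference at
# the peak approximates B_2 = -1/(N p q).
N, p = 1000, 0.3
q = 1.0 - p
n0 = int(N * p)   # peak location, n = N p

def log_r(n):
    return math.log((N - n) * p / ((n + 1) * q))

print(log_r(n0) - log_r(n0 - 1))   # about -4.755e-3
print(-1.0 / (N * p * q))          # about -4.762e-3
\end{verbatim}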

It follows from the above that the Taylor expansion of $\ln{\cal P}$ can be written

\begin{displaymath}
\ln{\cal P}(\overline{n_1}+\eta) \simeq \ln{\cal P}(\overline{n_1}) -
\frac{\eta^2}{2\,({\mit\Delta}^\ast n_1)^2} +\cdots.
\end{displaymath} (74)

Taking the exponential of both sides yields
\begin{displaymath}
{\cal P}(n)\simeq {\cal P}(\overline{n_1})\exp\!\left[-
\frac{(n-\overline{n_1})^2}{2\,({\mit\Delta}^\ast n_1)^2}\right].
\end{displaymath} (75)

The constant ${\cal P}(\overline{n_1})$ is most conveniently fixed by making use of the normalization condition
\begin{displaymath}
\sum_{n_1=0}^N P_N(n_1)=1,
\end{displaymath} (76)

which translates to
\begin{displaymath}
\int_0^N {\cal P}(n)\,dn \simeq 1
\end{displaymath} (77)

for a continuous distribution function. Since we only expect ${\cal P}(n)$ to be significant when $n$ lies in the relatively narrow range $\overline{n_1}\pm {\mit\Delta}^\ast n_1$, the limits of integration in the above expression can be replaced by $\pm \infty$ with negligible error. Thus,
\begin{displaymath}
{\cal P}(\overline{n_1})\int_{-\infty}^{\infty}\!\exp\!\left[-
\frac{(n-\overline{n_1})^2}{2\,({\mit\Delta}^\ast n_1)^2}\right] dn =
{\cal P}(\overline{n_1})\,\sqrt{2}\,{\mit\Delta}^\ast n_1\int_{-\infty}^{\infty}
\exp(-x^2)\,dx\simeq 1.
\end{displaymath} (78)

As is well known,

\begin{displaymath}
\int_{-\infty}^{\infty}
\exp(-x^2)\,dx = \sqrt{\pi},
\end{displaymath} (79)

so it follows from the normalization condition (78) that
\begin{displaymath}
{\cal P}(\overline{n_1})\simeq \frac{1}{
\sqrt{2\pi} \,{\mit\Delta}^\ast n_1}.
\end{displaymath} (80)
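
In passing, the value of the Gaussian integral, Eq. (79), is simple to check by direct summation; a crude Riemann sum suffices, as the following sketch shows:
\begin{verbatim}
import math

# Riemann-sum check of Eq. (79): the integral of exp(-x^2) over the
# real line equals sqrt(pi).  A finite window is adequate because
# the integrand is utterly negligible beyond |x| ~ 6.
h = 1.0e-3
total = h * sum(math.exp(-(k * h)**2) for k in range(-8000, 8001))
print(total, math.sqrt(math.pi))   # both about 1.7724539
\end{verbatim}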

Finally, we obtain
\begin{displaymath}
{\cal P}(n) \simeq \frac{1}{\sqrt{2\pi}\,{\mit\Delta}^\ast n_1}
\exp\!\left[-\frac{(n-\overline{n_1})^2}{2\,({\mit\Delta}^\ast n_1)^2}\right].
\end{displaymath} (81)

This is the famous Gaussian distribution function, named after the German mathematician Carl Friedrich Gauss, who discovered it whilst investigating the distribution of errors in measurements. The Gaussian distribution is only valid in the limits $N\gg 1$ and $\overline{n_1}\gg 1$.
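
The quality of the approximation is easily inspected. The sketch below (with $N$ and $p$ once more chosen arbitrarily) evaluates the exact binomial probability alongside the Gaussian form of Eq. (81):
\begin{verbatim}
import math

# Compare the exact binomial probability P_N(n_1) with the Gaussian
# approximation of Eq. (81) near the peak.  Illustrative values.
N, p = 1000, 0.3
q = 1.0 - p
mean  = N * p                     # mean value of n_1
width = math.sqrt(N * p * q)      # standard deviation of n_1

def binomial(n):
    return math.exp(math.lgamma(N + 1) - math.lgamma(n + 1)
                    - math.lgamma(N - n + 1)
                    + n * math.log(p) + (N - n) * math.log(q))

def gaussian(n):
    return (math.exp(-(n - mean)**2 / (2.0 * width**2))
            / (math.sqrt(2.0 * math.pi) * width))

for n in (270, 285, 300, 315, 330):
    print(n, binomial(n), gaussian(n))   # close agreement near the peak
\end{verbatim}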

Suppose we were to plot the probability $P_N(n_1)$ against the integer variable $n_1$, and then fit a continuous curve through the discrete points thus obtained. This curve would be equivalent to the continuous probability density curve ${\cal P}(n)$, where $n$ is the continuous version of $n_1$. According to Eq. (81), the probability density attains its maximum value when $n$ equals the mean of $n_1$, and is also symmetric about this point. In fact, when plotted with the appropriate ratio of vertical to horizontal scalings, the Gaussian probability density curve looks rather like the outline of a bell centred on $n= \overline{n_1}$. Hence, this curve is sometimes called a bell curve. At one standard deviation away from the mean value, i.e., $n=\overline{n_1}\pm {\mit\Delta}^\ast n_1$, the probability density is about 61% of its peak value. At two standard deviations away from the mean value, the probability density is about 13.5% of its peak value. Finally, at three standard deviations away from the mean value, the probability density is only about 1% of its peak value. We conclude that there is very little chance indeed that $n_1$ lies more than about three standard deviations away from its mean value. In other words, $n_1$ is almost certain to lie in the relatively narrow range $\overline{n_1}\pm 3\,{\mit\Delta}^\ast n_1$. This is a very well-known result.
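
These percentages follow directly from Eq. (81): at $k$ standard deviations from the mean, the density is reduced by the factor $\exp(-k^2/2)$ relative to its peak, while the fraction of the total probability lying within $k$ standard deviations of the mean is ${\rm erf}(k/\sqrt{2})$. A short sketch quantifies the "almost certain" claim above:
\begin{verbatim}
import math

# Gaussian density at k standard deviations from the mean, relative
# to its peak value, exp(-k^2/2), and the probability of lying
# within k standard deviations of the mean, erf(k/sqrt(2)).
for k in (1, 2, 3):
    print(k, math.exp(-k * k / 2.0), math.erf(k / math.sqrt(2.0)))
# k = 1:  0.607,  0.683
# k = 2:  0.135,  0.954
# k = 3:  0.011,  0.997
\end{verbatim}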

In the above analysis, we have gone from a discrete probability function $P_N(n_1)$ to a continuous probability density ${\cal P}(n)$. The normalization condition becomes

\begin{displaymath}
1= \sum_{n_1=0}^N P_N(n_1) \simeq \int_{-\infty}^{\infty}{\cal P}(n)\, dn
\end{displaymath} (82)

under this transformation. Likewise, the evaluations of the mean and variance of the distribution are written
\begin{displaymath}
\overline{n_1} = \sum_{n_1=0}^N P_N(n_1)\,n_1 \simeq \int_{-\infty}^{\infty}
{\cal P}(n)\,n\,dn,
\end{displaymath} (83)

and
\begin{displaymath}
\overline{({\mit\Delta} n_1)^2}\equiv
({\mit\Delta}^\ast n_1)^2 = \sum_{n_1=0}^N P_N(n_1)\,(n_1-\overline{n_1})^2
\simeq \int_{-\infty}^{\infty}{\cal P}(n)\,(n-\overline{n_1})^2\,dn,
\end{displaymath} (84)

respectively. These results follow as simple generalizations of previously established results for the discrete function $P_N(n_1)$. The limits of integration in the above expressions can be approximated as $\pm \infty$ because ${\cal P}(n)$ is only non-negligible in a relatively narrow range of $n$. Finally, it is easily demonstrated that Eqs. (82)-(84) are indeed true by substituting in the Gaussian probability density, Eq. (81), and then performing a few elementary integrals.
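
These identities are also easy to confirm numerically. Since successive integer values of $n$ are separated by unity, simply summing the Gaussian density of Eq. (81) over the integers is an excellent approximation to the integrals (illustrative values of $N$ and $p$ once more):
\begin{verbatim}
import math

# Numerical check of Eqs. (82)-(84): summing the Gaussian density
# over integer n approximates the corresponding integrals, because
# successive values of n are spaced by unity.
N, p = 1000, 0.3
q = 1.0 - p
mean  = N * p
width = math.sqrt(N * p * q)

def gaussian(n):
    return (math.exp(-(n - mean)**2 / (2.0 * width**2))
            / (math.sqrt(2.0 * math.pi) * width))

norm = sum(gaussian(n) for n in range(N + 1))
mu   = sum(n * gaussian(n) for n in range(N + 1))
var  = sum((n - mean)**2 * gaussian(n) for n in range(N + 1))
print(norm, mu, var)   # approximately 1, N p = 300, and N p q = 210
\end{verbatim}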


Richard Fitzpatrick 2006-02-02