4.5 The central limit theorem

The central limit theorem is, together with the law of large numbers, one of the two most fundamental results in probability. It states that the sum of a large number of independent identically distributed random variables has approximately a Gaussian distribution, no matter what the distribution of these variables is. This provides at least a partial theoretical justification for the ubiquity of the Gaussian distribution in probability and statistics. It is also the first instance of a remarkable phenomenon in probability and statistical physics called universality: if you take a complicated system made up of many small parts, the behaviour of the system on large scales is universal in the sense that it does not depend on the details of the individual parts. In this instance, the universal behaviour is the Gaussian distribution of the sum, no matter the distribution of the individual random variables.

Let \(X_1, X_2, \dots\) be a sequence of independent identically distributed real-valued random variables in \(L^1.\) The law of large numbers states that \[\frac{1}{n}(X_1 + \cdots + X_n) \longrightarrow \mathbb{E}[X_1]\] almost surely as \(n \to \infty.\) It is natural to ask how fast this convergence takes place, i.e. what is the typical size, or scale, of \(\frac{1}{n}(X_1 + \cdots + X_n) - \mathbb{E}[X_1],\) as a function of \(n.\)

For \(X_1 \in L^2,\) the answer is easy. Indeed, since \[\mathbb{E}\bigl[(X_1 + \cdots + X_n - n \mathbb{E}[X_1])^2\bigr] = \mathop{\mathrm{Var}}(X_1 + \cdots + X_n) = n \mathop{\mathrm{Var}}(X_1)\,,\] we find that \[\tag{4.16} \frac{1}{\sqrt{n}} (X_1 + \cdots + X_n - n \mathbb{E}[X_1])\] is typically of order one (since the expectation of its square is equal to \(\mathop{\mathrm{Var}}(X_1),\) which does not depend on \(n\)).
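This scale is easy to observe numerically. Below is a minimal Monte Carlo sketch in Python (assuming NumPy is available; the choice of Uniform(0,1) variables, with \(\mathbb{E}[X_1] = 1/2\) and \(\mathop{\mathrm{Var}}(X_1) = 1/12,\) is purely illustrative): the empirical standard deviation of (4.16) stays near \(\sqrt{1/12} \approx 0.289,\) whatever the value of \(n.\)

```python
# Monte Carlo sketch (illustration only): the spread of the rescaled sum
# (X_1 + ... + X_n - n E[X_1]) / sqrt(n) is of order one, independent of n.
import numpy as np

rng = np.random.default_rng(0)
trials = 10_000
for n in (10, 100, 1_000):
    samples = rng.random((trials, n))                        # i.i.d. Uniform(0,1)
    rescaled = (samples.sum(axis=1) - n * 0.5) / np.sqrt(n)  # quantity (4.16)
    print(n, rescaled.std())                                 # stays near sqrt(1/12) = 0.2887...
```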

The central limit theorem is a more precise version of this observation, as it even identifies the limiting law of (4.16).

Proposition 4.24 • Central limit theorem

Let \(X_1, X_2, \dots\) be independent identically distributed random variables in \(L^2,\) with variance \(\sigma^2.\) Then, as \(n \to \infty,\) the quantity (4.16) converges in law to a Gaussian random variable with mean zero and variance \(\sigma^2.\)
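Before giving the proof, here is a small simulation illustrating the statement (a Python sketch, assuming NumPy; the Rademacher variables, uniform on \(\{-1, +1\}\) with mean zero and \(\sigma^2 = 1,\) are an arbitrary illustrative choice). It compares the empirical distribution of (4.16) with the standard Gaussian distribution function:

```python
# CLT illustration (sketch): for Rademacher variables (mean 0, variance 1),
# compare the empirical distribution of Z_n = (X_1 + ... + X_n) / sqrt(n)
# with the standard Gaussian CDF.
import math
import numpy as np

rng = np.random.default_rng(1)
n, trials = 500, 20_000
steps = rng.choice([-1.0, 1.0], size=(trials, n))
z = steps.sum(axis=1) / math.sqrt(n)                        # the quantity (4.16)

for x in (-1.0, 0.0, 1.0, 2.0):
    empirical = (z <= x).mean()
    gaussian = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))   # standard Gaussian CDF
    print(f"x = {x:+.1f}   empirical {empirical:.4f}   Gaussian {gaussian:.4f}")
```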

Proof. With the technology of characteristic functions developed in the previous section, the proof is remarkably short. First, without loss of generality we may suppose that \(\mathbb{E}[X_1] = 0\) (otherwise, simply replace \(X_n\) with \(X_n - \mathbb{E}[X_n]\)).

We shall use that for any random variable \(X \in L^2\) we have \[\tag{4.17} \Phi_X(\xi) = 1 + \mathrm i\xi \mathbb{E}[X] - \frac{1}{2} \xi^2 \mathbb{E}[X^2] + o(\xi^2)\] as \(\xi \to 0.\) To show (4.17), we differentiate under the expectation, which is allowed since \(X \in L^2,\) to obtain \[\Phi_X'(\xi) = \mathrm i\, \mathbb{E}[X \, \mathrm e^{\mathrm i\xi X}]\,,\] and differentiating again yields \[\Phi_X''(\xi) = - \mathbb{E}[X^2 \, \mathrm e^{\mathrm i\xi X}]\,.\] By Taylor’s theorem with integral remainder, we therefore have \[\begin{aligned} \Phi_X(\xi) &= 1 + \mathrm i\, \mathbb{E}[X] \, \xi - \int_0^\xi \, \mathbb{E}[X^2 \, \mathrm e^{\mathrm it X}] \, (\xi - t) \, \mathrm dt \\ &= 1 + \mathrm i\, \mathbb{E}[X] \, \xi - \frac{1}{2} \xi^2 \mathbb{E}[X^2] - \int_0^\xi \, \mathbb{E}[X^2 \, (\mathrm e^{\mathrm it X} - 1)] \, (\xi - t) \, \mathrm dt\,, \end{aligned}\] where in the second step we used \(\int_0^\xi (\xi - t) \, \mathrm dt = \xi^2/2.\) The expectation under the last integral tends to zero as \(t \to 0\) by the dominated convergence theorem, and since \(\int_0^\xi |\xi - t| \, \mathrm dt = \xi^2/2,\) the whole integral is \(o(\xi^2).\) This proves (4.17).
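As an aside (not part of the proof), the expansion (4.17) is easy to check numerically on a concrete example. For \(X\) uniform on \(\{-1, +1\}\) one has \(\Phi_X(\xi) = \cos \xi,\) \(\mathbb{E}[X] = 0\) and \(\mathbb{E}[X^2] = 1,\) so (4.17) predicts that \(\cos \xi - (1 - \xi^2/2)\) is \(o(\xi^2)\):

```python
# Sketch: check that the remainder in (4.17) is o(xi^2) for Phi_X(xi) = cos(xi).
import math

for xi in (1.0, 0.1, 0.01, 0.001):
    remainder = math.cos(xi) - (1.0 - 0.5 * xi**2)
    print(f"xi = {xi:<6}  remainder / xi^2 = {remainder / xi**2:.3e}")  # tends to 0
```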

With \(Z_n :=\frac{X_1 + \cdots + X_n}{\sqrt{n}}\) we have, by independence of the variables \(X_1, \dots, X_n,\) \[\Phi_{Z_n}(\xi) = \mathbb{E}\biggl[\exp \biggl(\mathrm i\xi \frac{X_1 + \cdots + X_n}{\sqrt{n}}\biggr)\biggr] = \mathbb{E}[\exp(\mathrm i\xi X_1 / \sqrt{n})]^n = \Phi_{X_1}(\xi / \sqrt{n})^n\,.\] By (4.17), we therefore get, for any \(\xi \in \mathbb{R},\) \[\Phi_{Z_n}(\xi) = \biggl(1 - \frac{\sigma^2 \xi^2}{2 n} + o\biggl(\frac{\xi^2}{n}\biggr)\biggr)^n \longrightarrow \mathrm e^{-\frac{\sigma^2}{2} \xi^2}\] as \(n \to \infty.\) The claim now follows from Propositions 4.20 and 4.23.
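This pointwise convergence of characteristic functions can also be observed directly. For Rademacher variables (\(\sigma^2 = 1\)) one has \(\Phi_{Z_n}(\xi) = \cos(\xi/\sqrt{n})^n,\) which the following sketch compares with the limit \(\mathrm e^{-\xi^2/2}\):

```python
# Sketch: Phi_{Z_n}(xi) = cos(xi / sqrt(n))^n approaches exp(-xi^2 / 2).
import math

xi = 1.5
for n in (10, 100, 1_000, 10_000):
    print(n, math.cos(xi / math.sqrt(n)) ** n)
print("limit:", math.exp(-xi**2 / 2))
```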

This is the end of this course. I hope you enjoyed it!

Now you know all of the fundamentals of probability. If you liked what you learned (as I hope!), you are fully equipped to go on and learn about more advanced topics such as Markov chains and martingales.
