The Glivenko–Cantelli theorem, Proof, Simulations

The Glivenko–Cantelli theorem

The Glivenko–Cantelli theorem, also called the Fundamental Theorem of Statistics, describes the asymptotic behaviour of the empirical distribution function as the number of independent and identically distributed observations grows.

To understand what the Glivenko–Cantelli theorem says, we first need the following two definitions:

The Cumulative Distribution Function (CDF) of a random variable \(X\), evaluated at \(x\), is the probability that \(X\) will take a value less than or equal to \(x\): \( F_X(x) = P(X \le x) \).

The Empirical Distribution Function (EDF) of a given set of observed data points assigns probability mass \(1/n\) to each data point, where \(n\) is the total number of observations. It is a step function that increases by \(1/n\) at each data point (by \(k/n\) at a value that is observed \(k\) times). Mathematically, the EDF is defined as: \[ F_n(x) = \frac{1}{n} \sum_{i=1}^{n} I(X_i \le x) \] where \(n\) is the number of observations, \(X_i\) is the \(i\)-th observation, and \(I(\cdot)\) is the indicator function, equal to 1 when its argument is true and 0 otherwise.
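The definition of \(F_n\) translates almost directly into code. The following is a minimal sketch in Python using only the standard library; the helper name `edf` and the small sample are illustrative assumptions, not part of the original text.

```python
from bisect import bisect_right


def edf(sample):
    """Return the empirical distribution function F_n of `sample`."""
    xs = sorted(sample)
    n = len(xs)

    def F_n(x):
        # bisect_right returns the number of observations X_i with X_i <= x,
        # which is exactly the sum of the indicators I(X_i <= x).
        return bisect_right(xs, x) / n

    return F_n


# Hypothetical sample with one repeated value (2.0 appears twice).
sample = [2.0, 0.5, 1.0, 2.0]
F_n = edf(sample)
values = [F_n(x) for x in (0.0, 0.5, 1.5, 2.0, 3.0)]
print(values)  # [0.0, 0.25, 0.5, 1.0, 1.0]
```

Because the value 2.0 occurs twice in this sample, \(F_n\) jumps by \(2/4\) there rather than by \(1/4\).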

Definition:

Let \(X_1, \dots, X_n\) be independent and identically distributed real random variables with CDF \(F(x)\). The EDF for \(X_1, \dots, X_n\) is defined as follows: \[ F_n(x)=\frac{1}{n}\sum_{i=1}^{n} I_{[X_i, \infty)}(x) \] where \(I_{[X_i, \infty)}(x)\) equals 1 if \(X_i \le x\) and 0 otherwise.

The Glivenko-Cantelli theorem states that, as the number of independent and identically distributed random variables \(n\) grows, the empirical distribution function \(F_n(x)\), based on these variables, converges almost surely and uniformly to the true cumulative distribution function \(F(x)\). Formally, this can be represented as follows:

\[ P\left( \lim_{n \to \infty} \sup_{x} \left| F_n(x) - F(x) \right| = 0 \right) = 1 \]

In particular, almost sure convergence implies convergence in probability, so for every \(\epsilon > 0\):

\[ \lim_{n \to \infty} P\left( \sup_{x} \left| F_n(x) - F(x) \right| > \epsilon\right) = 0 \]

Here \(n\) is the number of observations or sample size; \(\sup_x\) denotes the supremum over all real values of \(x\); \(\epsilon\) is an arbitrary positive value; \(\lim_{n \to \infty}\) denotes the limit as \(n\) approaches infinity.
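The uniform convergence can be checked numerically. The following is a minimal sketch, assuming samples drawn from the Uniform(0, 1) distribution, for which \(F(x) = x\) on \([0, 1]\). The helper names `sup_distance` and `uniform_cdf`, the seed, and the sample sizes are illustrative choices, not part of the original text.

```python
import random


def sup_distance(sample, cdf):
    """Compute sup_x |F_n(x) - F(x)|, assuming `cdf` is continuous."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        fx = cdf(x)
        # F_n jumps from (i-1)/n to i/n at the i-th smallest observation,
        # so the supremum is attained at or just before one of the jumps.
        d = max(d, i / n - fx, fx - (i - 1) / n)
    return d


def uniform_cdf(x):
    """CDF of the Uniform(0, 1) distribution."""
    return min(1.0, max(0.0, x))


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
distances = {}
for n in (10, 100, 1000, 10000, 100000):
    sample = [rng.random() for _ in range(n)]
    distances[n] = sup_distance(sample, uniform_cdf)
    print(n, round(distances[n], 4))
```

The printed distances should shrink towards zero as \(n\) grows, although not necessarily monotonically for a single run, since each sample size uses a fresh random sample.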

Simulations:

References: [1]