Statistics Homeworks

The Glivenko–Cantelli theorem

The Glivenko–Cantelli theorem, also called the Fundamental Theorem of Statistics, describes the asymptotic behaviour of the empirical distribution function as the number of independent and identically distributed random variables grows.

To understand the Glivenko–Cantelli theorem, we first need to define the empirical distribution function (EDF):

Definition:

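Let \(X_1, \dots, X_n\) be independent and identically distributed real random variables with cumulative distribution function (CDF) \(F(x)\). The empirical distribution function (EDF) of \(X_1, \dots, X_n\) is defined as follows: \[ F_n(x)=\frac{1}{n}\sum_{i=1}^{n} I_{[X_i, \infty)}(x) \] where the indicator \(I_{[X_i, \infty)}(x)\) equals 1 if \(X_i \le x\) and 0 otherwise, so \(F_n(x)\) is the fraction of observations that do not exceed \(x\).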
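The definition can be illustrated with a minimal Python sketch. The helper name `edf` and the four-point sample are assumptions made for illustration only; the function returns the fraction of observations less than or equal to \(x\).

```python
from bisect import bisect_right


def edf(sample):
    """Return the empirical distribution function F_n of `sample` (hypothetical helper)."""
    ordered = sorted(sample)
    n = len(ordered)

    def F_n(x):
        # bisect_right gives the number of observations X_i with X_i <= x
        return bisect_right(ordered, x) / n

    return F_n


# Illustrative sample, not taken from the text
F_n = edf([2.0, 0.5, 1.0, 1.0])
print(F_n(0.0), F_n(1.0), F_n(3.0))  # prints: 0.0 0.75 1.0
```

The EDF is a step function: it jumps by \(1/n\) at each observation (by \(2/n\) at the repeated value 1.0 in this sample).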

The Glivenko–Cantelli theorem states that, as the sample size \(n\) grows, the empirical distribution function \(F_n(x)\) of \(n\) independent and identically distributed random variables converges almost surely, and uniformly in \(x\), to the true cumulative distribution function \(F(x)\). Formally, this can be represented as follows:

\[ P\left( \lim_{{n \to \infty}} \sup_x \left| F_n(x) - F(x) \right| = 0 \right) = 1 \]

In particular, for every \(\epsilon > 0\):

\[ \lim_{{n \to \infty}} P\left( \sup_x \left| F_n(x) - F(x) \right| > \epsilon\right) = 0 \]

Here \(n\) is the number of observations (the sample size); \(\sup_x\) denotes the supremum over all real values of \(x\), which is what makes the convergence uniform; \(\epsilon\) is an arbitrary positive value; \(\lim_{{n \to \infty}}\) denotes the limit as \(n\) approaches infinity.

Simulations:
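A minimal simulation sketch, under stated assumptions: the samples are drawn from a standard normal distribution, and the seed, the sample sizes and the helper names `normal_cdf` and `sup_distance` are chosen here for illustration only. For each \(n\) it computes \(\sup_x |F_n(x) - F(x)|\), which for a continuous \(F\) is attained at one of the jump points of \(F_n\).

```python
import math
import random


def normal_cdf(x):
    """CDF of the standard normal distribution (the assumed true F)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def sup_distance(sample, cdf):
    """Compute sup_x |F_n(x) - F(x)| for a continuous CDF `cdf`."""
    ordered = sorted(sample)
    n = len(ordered)
    d = 0.0
    for i, x in enumerate(ordered, start=1):
        f = cdf(x)
        # F_n jumps from (i - 1)/n to i/n at x, so check both sides of the jump
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


rng = random.Random(0)  # fixed seed so the run is reproducible
for n in (10, 100, 1000, 10000, 100000):
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    print(n, round(sup_distance(sample, normal_cdf), 4))
```

If the theorem holds as stated, the printed distances should tend towards zero as \(n\) grows, typically shrinking at a rate of roughly \(1/\sqrt{n}\), although individual runs fluctuate.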

References: [1]