Statistics Homeworks

Algorithms for random variate generation

Here you can find the complete thesis on this topic.

In statistics, a random variate is a specific outcome of a random variable, which is used to simulate stochastic processes. The generation of a random variate from a given distribution is known as random number generation or non-uniform pseudo-random variate generation.

There are several methods/algorithms to generate a random variate. The main goal is to use \(\mathcal{U}(0, 1)\) (i.e. the standard uniform distribution, where every point in the continuous range \([0.0, 1.0]\) is equally likely to appear) to generate variates from other distributions, e.g. discrete distributions like the Bernoulli, binomial and Poisson, or continuous distributions like the exponential or the normal; the main methods are described below.


1. Inverse Transform Method
Let \(X\) be a continuous RV with cdf \(F(x)\). Then \[F(X) \sim \mathcal{U}(0, 1).\] It is possible to define the inverse cdf as follows: \[F^{-1}(u) = \inf\{x : F(x) \geq u\}, \quad u \in [0, 1].\] Let \(U \sim \mathcal{U}(0,1)\). Then the random variable \(F^{-1}(U)\) has the same distribution as \(X\). The inverse transform method for generating a RV \(X\) having cdf \(F(x)\) is the following:

  1. Sample \(U\) from \(\mathcal{U}(0, 1)\).
  2. Return \(X = F^{-1}(U)\).
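For instance, for the exponential distribution the inverse cdf has a closed form. A minimal Python sketch, using only the standard library (the function name is illustrative):

```python
import math
import random

def exponential_variate(rate):
    """Exp(rate) variate via the inverse transform method.

    The cdf is F(x) = 1 - exp(-rate * x), so F^{-1}(u) = -ln(1 - u) / rate.
    """
    u = random.random()                # step 1: sample U ~ U(0, 1)
    return -math.log(1.0 - u) / rate   # step 2: return X = F^{-1}(U)
```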


2. Cutpoint Method
In this case we want to generate from the following discrete distribution: \[P(X=k) = p_k, \quad k = a, a+1, \dots, b, \] with large \(b - a\).
Let \[ q_k = P(X \leq k), \quad k = a, a+1, \dots, b.\] For fixed \(m\), the cutpoint method of Fishman and Moore computes and stores the cutpoints \[I_j = \min \left\{ k : q_k > \frac{j-1}{m} \right\}, \quad j=1,\dots, m. \] With these cutpoints it is possible to scan through the list of possible \(k\)-values much more quickly than with the regular inverse transform method.

The algorithm that computes the cutpoints is the following:
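A possible Python sketch of the set-up step (the data layout, with the probabilities indexed from \(a\), is an assumption):

```python
def compute_cutpoints(p, a, m):
    """Cutpoints I_j = min{k : q_k > (j - 1) / m}, j = 1, ..., m.

    p[k - a] is the probability of the support point k = a, a + 1, ..., b.
    Returns the cumulative probabilities q and the list of cutpoints.
    """
    b = a + len(p) - 1
    q, total = [], 0.0
    for pk in p:                       # q[k - a] = P(X <= k)
        total += pk
        q.append(total)
    cutpoints, k = [], a
    for j in range(1, m + 1):
        # advance k until q_k exceeds (j - 1) / m
        while k < b and q[k - a] <= (j - 1) / m:
            k += 1
        cutpoints.append(k)
    return q, cutpoints
```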

Then we can use the cutpoint method. Briefly, the algorithm selects the integer \(L = \lfloor mU \rfloor + 1\) and starts the search at the value \( I_L \). It is correct because \[ P(I_L \leq X \leq I_{L+1}) = 1 \quad (I_{m+1} = b). \]
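A sketch of the generation step, continuing the code above:

```python
import random

def cutpoint_generate(q, a, cutpoints, m):
    """Generate X by starting the inverse-transform search at cutpoint I_L."""
    u = random.random()
    L = int(m * u) + 1                 # L = floor(mU) + 1
    k = cutpoints[L - 1]               # start the search at I_L
    b = a + len(q) - 1
    while k < b and q[k - a] < u:      # short linear search from I_L
        k += 1
    return k
```

With `q, cuts = compute_cutpoints(p, a, m)`, a variate is then obtained as `cutpoint_generate(q, a, cuts, m)`.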


3. Convolution Method
In this context, convolution means adding random variables up.

Example: Binomial(\(n, p\)).
Suppose \(X_1, \dots, X_n\) are independent and identically distributed (iid) Bern\((p)\). Then \(Y = \sum_{i=1}^{n} X_i \sim \) Bin(\(n, p\)).
How do we get the Bernoulli RVs?
Suppose \( U_1, \dots, U_n \) are iid \(\mathcal{U}(0,1)\):
  If \(U_i \leq p \) then \(X_i = 1\);
  otherwise \(X_i = 0\).
Repeat for \(i = 1, \dots, n\).
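A minimal Python sketch of this convolution (the function name is illustrative):

```python
import random

def binomial_variate(n, p):
    """Bin(n, p) variate as the sum of n iid Bern(p) variates."""
    x = 0
    for _ in range(n):
        if random.random() <= p:       # X_i = 1 if U_i <= p, else 0
            x += 1
    return x
```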


4. Acceptance-Rejection Method

The idea: to generate \(X\) with pdf \(f(x)\), find a density \(g(x)\) that is easy to sample from and a constant \(c\) such that \(f(x) \leq c\,g(x)\) for all \(x\); then repeatedly sample \(Y\) from \(g\) and accept it with probability \(f(Y)/(c\,g(Y))\).

Simple example: generate a \( \mathcal{U} (2/3, 1) \) RV.
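A minimal sketch of this example; the choice of proposal \(g\), target \(f\), and constant \(c\) is spelled out in the docstring, and the function name is illustrative:

```python
import random

def uniform_two_thirds_one():
    """U(2/3, 1) variate by acceptance-rejection.

    Proposal: Y ~ U(0, 1) with density g(y) = 1. Target: f(y) = 3 on
    [2/3, 1], so f(y) <= c * g(y) with c = 3, and the acceptance
    probability f(Y) / (c * g(Y)) is 1 on [2/3, 1] and 0 elsewhere.
    """
    while True:
        y = random.random()            # sample from the proposal
        if y >= 2.0 / 3.0:             # accept; otherwise reject and retry
            return y
```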


5. Composition Method

Suppose a RV actually comes from a mixture of other RVs. The goal is to generate a RV with the following cdf: \[F(x) = \sum_{j=1}^{\infty} p_j F_j (x),\] where \(p_j > 0\) for all \(j\), \(\sum_j p_j = 1\), and the \(F_j(x)\)'s are "easy" cdf's to generate from.

  1. Generate a positive integer \(J\) such that \(P(J=j) = p_j\) for all \(j\).
  2. Return \(X\) from cdf \(F_J(x)\).
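As a hypothetical example, a finite mixture of exponential cdf's, where each \(F_j\) is then generated by inverse transform:

```python
import math
import random

def mixture_of_exponentials(probs, rates):
    """Composition method for F(x) = sum_j p_j * (1 - exp(-rates[j] * x)).

    probs[j] = p_j (assumed to sum to 1); rates[j] is the rate of component j.
    """
    u = random.random()
    cum, j = 0.0, 0
    for j, pj in enumerate(probs):     # step 1: pick J with P(J = j) = p_j
        cum += pj
        if u <= cum:
            break
    # step 2: generate X from F_J by inverse transform
    return -math.log(1.0 - random.random()) / rates[j]
```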


6. Box-Muller Method

It is an easy way to generate standard normals.


Theorem: If \(U_1, U_2\) are iid \(\mathcal{U}(0,1)\), then:
\[Z_1 = \sqrt{-2\ln(U_1)} \cos(2\pi U_2)\] \[Z_2 = \sqrt{-2\ln(U_1)} \sin(2\pi U_2)\] are iid Nor(0,1).
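A direct transcription in Python (taking \(1 - U\) only avoids evaluating \(\ln 0\)):

```python
import math
import random

def box_muller():
    """Pair of iid Nor(0, 1) variates via the Box-Muller transform."""
    u1 = 1.0 - random.random()         # in (0, 1], so log(u1) is finite
    u2 = random.random()
    r = math.sqrt(-2.0 * math.log(u1))
    return r * math.cos(2.0 * math.pi * u2), r * math.sin(2.0 * math.pi * u2)
```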


7. Polar Method

It is a little faster than Box-Muller:

  1. Generate \(U_1, U_2\) iid \(\mathcal{U}(0,1)\)
      Let \(V_i = 2U_i - 1\) , \(i = 1, 2\), and \(W = V^{2}_1 + V^{2}_2\).
  2. If \(W > 1\), REJECT and go to 1
    Otherwise, let \(Y = \sqrt{-2\ln(W)/W}\), and ACCEPT \( Z_i \leftarrow V_{i}Y, \; i = 1,2 \).
Then \(Z_1, Z_2\) are iid Nor(0, 1).
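A Python sketch of the polar method (the \(W > 0\) guard only avoids a division by zero):

```python
import math
import random

def polar_method():
    """Pair of iid Nor(0, 1) variates via the polar method."""
    while True:
        v1 = 2.0 * random.random() - 1.0   # V_i = 2 U_i - 1, i = 1, 2
        v2 = 2.0 * random.random() - 1.0
        w = v1 * v1 + v2 * v2
        if 0.0 < w <= 1.0:                 # reject points outside the unit disk
            y = math.sqrt(-2.0 * math.log(w) / w)
            return v1 * y, v2 * y
```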

References: for the theory, see [1] and [2].
The algorithms are taken from [3].