stat341 / CM 361
Computational Statistics and Data Analysis is a course offered at the University of Waterloo
Spring 2009
Instructor: Ali Ghodsi
Sampling (Generating Random numbers)
Lecture of May 12th, 2009
In order to study statistics computationally, we need a good way to generate random numbers from various distributions using computational methods, or at least numbers whose distribution appears to be random (pseudo-random). Outside a computational setting, this is fairly easy (at least for the uniform distribution). Rolling a die, for example, produces numbers with a uniform distribution very well.
We begin by considering the simplest case: the uniform distribution.
One way to generate pseudorandom numbers is using the Multiplicative Congruential Method. This involves three integer parameters a, b, and n, and a seed variable x0. This method deterministically (based on the seed) generates a sequence of numbers with a seemingly random distribution (with some caveats). It proceeds as follows:
- [math]\displaystyle{ x_{i+1} = ax_{i} + b \mod{n} }[/math]
For example, with a = 13, b = 0, m = 31, x0 = 1, we have:
- [math]\displaystyle{ x_{i+1} = 13x_{i} \mod{31} }[/math]
So,
- [math]\displaystyle{ \begin{align} x_{0} &{}= 1 \end{align} }[/math]
- [math]\displaystyle{ \begin{align} x_{1} &{}= 13*1 + 0 \mod{31} \\ &{}= 13 \end{align} }[/math]
- [math]\displaystyle{ \begin{align} x_{2} &{}= 13*13 + 0 \mod{31} \\ &{}= 14 \end{align} }[/math]
etc.
Inverse Transform Method
Step 1: Draw [math]\displaystyle{ U~ \sim~ Unif [0,1] }[/math].
Step 2: Compute [math]\displaystyle{ X = F^{-1}(U) }[/math].
Example:
Suppose we want to draw a sample from [math]\displaystyle{ f(x) = \lambda e^{-\lambda x} }[/math] where [math]\displaystyle{ x\gt 0 }[/math].
We need to first find [math]\displaystyle{ F(x) }[/math] and
then [math]\displaystyle{ F^{-1} }[/math].
[math]\displaystyle{ F(x) = \int^x_0 \theta e^{-\theta u} du = 1 - e^{-\theta x} }[/math]
[math]\displaystyle{ F^{-1}(x) = \frac{-log(1-y)}{\theta} }[/math]
Now we can generate our random
sample [math]\displaystyle{ i=1\dots n }[/math] from [math]\displaystyle{ f(x) }[/math] by:
[math]\displaystyle{ 1)\ u_i \sim UNIF(0,1) }[/math]
[math]\displaystyle{ 2)\ x_i = \frac{-log(1-u_i)}{\theta} }[/math]
The [math]\displaystyle{ x_i }[/math] are now a random sample from [math]\displaystyle{ f(x) }[/math].
The major problem with this approach is that we have to find
[math]\displaystyle{ F^{-1} }[/math] and for many distributions it is too difficult to find the inverse of
[math]\displaystyle{ F(x) }[/math].