stat341 / CM 361
<hr />
'''Computational Statistics and Data Analysis''' is a course offered at the University of Waterloo<br /><br />
Spring 2009<br /><br />
Instructor: Ali Ghodsi <br />
<br />
<br />
<br />
==Sampling (Generating random numbers)==<br />
<br />
===Lecture of May 12th, 2009===<br />
<br />
Generating random numbers in a computational setting is challenging because a computer is a deterministic machine: it cannot produce truly random output, only ''pseudo-random'' sequences constructed so that each possible value appears to occur with equal (uniform) probability. Outside a computational setting, producing uniformly distributed random numbers is fairly easy (for example, rolling a fair die repetitively produces a series of random numbers from 1 to 6).<br />
<br />
We begin by considering the simplest case: the uniform distribution.<br />
<br />
====Multiplicative Congruential Method====<br />
<br />
One way to generate pseudo random numbers from the uniform distribution is using the '''Multiplicative Congruential Method'''. This involves three integer parameters ''a'', ''b'', and ''m'', and a '''seed''' variable ''x<sub>0</sub>''. This method deterministically generates a sequence of numbers (based on the seed) with a seemingly random distribution (with some caveats). It proceeds as follows:<br />
<br />
:<math>x_{i+1} = (ax_{i} + b) \mod{m}</math><br />
<br />
For example, with ''a'' = 13, ''b'' = 0, ''m'' = 31, ''x<sub>0</sub>'' = 1, we have:<br />
<br />
:<math>x_{i+1} = 13x_{i} \mod{31}</math><br />
<br />
So,<br />
<br />
:<math>\begin{align} x_{0} &{}= 1 \end{align}</math><br />
:<math>\begin{align}<br />
x_{1} &{}= 13 \times 1 + 0 \mod{31} = 13 \\<br />
\end{align}</math><br />
:<math>\begin{align}<br />
x_{2} &{}= 13 \times 13 + 0 \mod{31} = 14 \\<br />
\end{align}</math><br />
:<math>\begin{align}<br />
x_{3} &{}= 13 \times 14 + 0 \mod{31} =27 \\<br />
\end{align}</math><br />
<br />
etc.<br />
<br />
For well-chosen values of ''a'', ''b'', and ''m'', this method generates a sequence that cycles through all integers between 0 and ''m'' - 1 before repeating. Dividing each term of the resulting sequence by ''m'' - 1 then gives a sequence of numbers between 0 and 1 that resembles a sample from the uniform distribution.<br />
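<br />
A minimal Matlab sketch of this recurrence, using the toy parameters ''a'' = 13, ''b'' = 0, ''m'' = 31 and the seed ''x<sub>0</sub>'' = 1 from the example above (the variable names and the choice of 30 iterations are illustrative only):<br />
<br />
 a = 13; b = 0; m = 31;              % toy parameters from the example above<br />
 x = zeros(1, 30);<br />
 x(1) = 1;                           % the seed x0<br />
 for i = 2:30<br />
     x(i) = mod(a * x(i-1) + b, m);  % x_{i+1} = (a*x_i + b) mod m<br />
 end<br />
 u = x / (m - 1);                    % scale the output to [0, 1]<br />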
<br />
Of course, not all values of ''a'', ''b'', and ''m'' behave in this way; poorly chosen values produce very short cycles and are not suitable for generating pseudo random numbers. <br />
<br />
For example, with ''a'' = 3, ''b'' = 2, ''m'' = 4, ''x<sub>0</sub>'' = 1, we have:<br />
<br />
:<math>x_{i+1} = (3x_{i} + 2) \mod{4}</math><br />
<br />
So,<br />
<br />
:<math>\begin{align} x_{0} &{}= 1 \end{align}</math><br />
:<math>\begin{align}<br />
x_{1} &{}= 3 \times 1 + 2 \mod{4} = 1 \\<br />
\end{align}</math><br />
:<math>\begin{align}<br />
x_{2} &{}= 3 \times 1 + 2 \mod{4} = 1 \\<br />
\end{align}</math><br />
<br />
etc.<br />
<br />
In practice, Park and Miller (1988) found that ''a'' = 7<sup>5</sup>, ''b'' = 0, and ''m'' = 2<sup>31</sup> - 1 = 2147483647 (the largest signed integer on a 32-bit system) are good values for the Multiplicative Congruential Method.<br />
<br />
Java's Random class is based on a generator with ''a'' = 25214903917, ''b'' = 11, and ''m'' = 2<sup>48</sup><ref>http://java.sun.com/javase/6/docs/api/java/util/Random.html#next(int)</ref>. The class returns at most the 32 leading bits of each ''x<sub>i</sub>'', so it is possible to get the same value twice in a row (when ''x<sub>0</sub>'' = 18698324575379, for instance) without the sequence repeating forever.<br />
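<br />
As a rough sketch (assuming double precision arithmetic, which is exact here since ''a''·''x<sub>i</sub>'' stays well below 2<sup>53</sup>), the Park and Miller generator can be run in Matlab as follows; the seed and the number of draws are illustrative choices:<br />
<br />
 a = 7^5; m = 2^31 - 1;     % Park and Miller (1988) values, with b = 0<br />
 x = 1;                     % an illustrative seed x0<br />
 u = zeros(1, 10000);<br />
 for i = 1:10000<br />
     x = mod(a * x, m);     % exact in doubles because a*x < 2^53<br />
     u(i) = x / (m - 1);    % scale to (0, 1] as described above<br />
 end<br />
 hist(u)                    % the histogram should look approximately uniform<br />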
<br />
====General Methods====<br />
<br />
Since the Multiplicative Congruential Method can only be used for the uniform distribution, other methods must be developed in order to generate pseudo random numbers from other distributions.<br />
<br />
=====Inverse Transform Method=====<br />
<br />
This method uses the fact that applying the inverse of a cumulative distribution function (cdf) to a sample from the uniform distribution produces a random sample from the distribution with that cdf. This is shown by the following theorem:<br />
<br />
'''Theorem''':<br />
<br />
If <math>U \sim~ \mathrm{Unif}[0, 1]</math> is a random variable and <math>X = F^{-1}(U)</math>, where F is a continuous, monotonic cumulative distribution function (cdf), then the random variable X has cdf F.<br />
<br />
'''Proof''':<br />
<br />
Recall that, if ''f'' is the pdf corresponding to F, then,<br />
<br />
:<math>F(x) = P(X \leq x) = \int_{-\infty}^x f(u)\, du</math><br />
<br />
So F is monotonically non-decreasing, since the probability that X is at most a larger number is at least the probability that X is at most a smaller number.<br />
<br />
Note also that in the uniform distribution on [0, 1], we have for all ''a'' within [0, 1], <math>P(U \leq a) = a</math>.<br />
<br />
So,<br />
<br />
:<math>\begin{align}<br />
P(F^{-1}(U) \leq x) &{}= P(F(F^{-1}(U)) \leq F(x)) \\<br />
&{}= P(U \leq F(x)) \\<br />
&{}= F(x)<br />
\end{align}</math><br />
<br />
Completing the proof.<br />
<br />
'''Procedure (Continuous Case)'''<br />
<br />
This method then gives us the following procedure for finding pseudo random numbers from a continuous distribution:<br />
<br />
*Step 1: Draw <math>U \sim~ Unif [0, 1] </math>.<br />
*Step 2: Compute <math> X = F^{-1}(U) </math>.<br />
<br />
'''Example''':<br />
<br />
Suppose we want to draw a sample from <math>f(x) = \lambda e^{-\lambda x} </math> where <math>x > 0</math> (the exponential distribution).<br />
<br />
We need to first find <math>F(x)</math> and then its inverse, <math>F^{-1}</math>.<br />
<br />
:<math> F(x) = \int^x_0 \lambda e^{-\lambda u} du = 1 - e^{-\lambda x} </math><br />
<br />
:<math> F^{-1}(u) = \frac{-\log(1-u)}{\lambda} </math><br />
<br />
(Since <math>1-U</math> and <math>U</math> have the same <math>Unif[0, 1]</math> distribution, <math>\frac{-\log(u)}{\lambda}</math> may be used equivalently.)<br />
<br />
Now we can generate our random sample <math>x_i</math>, <math>i=1\dots n</math>, from <math>f(x)</math> by:<br />
<br />
:<math>1)\ u_i \sim Unif[0, 1]</math><br />
:<math>2)\ x_i = \frac{-\log(1-u_i)}{\lambda}</math><br />
<br />
The <math>x_i</math> are now a random sample from <math>f(x)</math>.<br />
<br />
<br />
This example can be illustrated in Matlab using the code below: generate the <math>u_i</math>, calculate the <math>x_i</math> using the above formula with <math>\lambda=1</math>, and plot the histogram of the <math>x_i</math> for <math>i=1,...,100,000</math>.<br />
<br />
 u=rand(1,100000);   % draw 100,000 samples from Unif[0,1]<br />
 x=-log(1-u)/1;      % apply F^{-1} with lambda = 1<br />
 hist(x)             % plot the histogram of the sample<br />
<br />
The histogram shows the expected exponential shape.<br />
<br />
[[File:HistRandNum.jpg]]<br />
<br />
The major problem with this approach is that we must find <math>F^{-1}</math>, and for many distributions it is too difficult (or impossible) to invert <math>F(x)</math>. Further, for some distributions there is not even a closed form expression for <math>F(x)</math>; and even when such an expression exists, it is usually still difficult to find <math>F^{-1}</math>.<br />
<br />
'''Procedure (Discrete Case)'''<br />
<br />
The above method can be easily adapted to work on discrete distributions as well.<br />
<br />
In general in the discrete case, we have <math>x_0, \dots , x_n</math> where:<br />
<br />
:<math>\begin{align}P(X = x_i) &{}= p_i \end{align}</math><br />
:<math>x_0 \leq x_1 \leq x_2 \dots \leq x_n</math><br />
:<math>\sum p_i = 1</math><br />
<br />
Thus we can define the following method to find pseudo random numbers in the discrete case (the last interval is taken to be closed on the right so that the case <math>U = 1</math> is not missed):<br />
<br />
*Step 1: Draw <math> U~ \sim~ Unif [0,1] </math>.<br />
*Step 2:<br />
**If <math>U < p_0</math>, return <math>X = x_0</math><br />
**If <math>p_0 \leq U < p_0 + p_1</math>, return <math>X = x_1</math><br />
** ...<br />
**In general, if <math>p_0+ p_1 + \dots + p_{k-1} \leq U < p_0 + \dots + p_k</math>, return <math>X = x_k</math> (for the final value <math>x_n</math>, take <math>\leq</math> on the right so that <math>U = 1</math> is included)<br />
<br />
'''Example''' (from class):<br />
<br />
Suppose we have the following discrete distribution:<br />
<br />
:<math>\begin{align}<br />
P(X = 0) &{}= 0.3 \\<br />
P(X = 1) &{}= 0.2 \\<br />
P(X = 2) &{}= 0.5<br />
\end{align}</math><br />
<br />
The cumulative distribution function (cdf) for this distribution is then:<br />
<br />
:<math><br />
F(x) = \begin{cases}<br />
0, & \text{if } x < 0 \\<br />
0.3, & \text{if } 0 \leq x < 1 \\<br />
0.5, & \text{if } 1 \leq x < 2 \\<br />
1, & \text{if } 2 \leq x<br />
\end{cases}</math><br />
<br />
Then we can generate numbers from this distribution like this, given <math>u_0, \dots, u_n</math> from <math>U \sim~ Unif[0, 1]</math>:<br />
<br />
:<math><br />
x_i = \begin{cases}<br />
0, & \text{if } u_i \leq 0.3 \\<br />
1, & \text{if } 0.3 < u_i \leq 0.5 \\<br />
2, & \text{if } 0.5 < u_i \leq 1<br />
\end{cases}</math><br />
<br />
This example can be illustrated in Matlab using the code below:<br />
<br />
 p=[0.3,0.2,0.5];            % probability mass function of X<br />
 x=zeros(1,1000);            % preallocate the sample vector<br />
 for i=1:1000<br />
     u=rand;                 % draw U ~ Unif[0,1]<br />
     if u <= p(1)            % U in [0, 0.3]<br />
         x(i)=0;<br />
     elseif u <= sum(p(1:2)) % U in (0.3, 0.5]<br />
         x(i)=1;<br />
     else                    % U in (0.5, 1]<br />
         x(i)=2;<br />
     end<br />
 end<br />
<br />
===Lecture of May 14th, 2009===<br />
<br />
Today, we continue the discussion on sampling (generating random numbers) from general distributions with the '''Acceptance/Rejection Method'''.<br />
<br />
====Acceptance/Rejection Method====<br />
<br />
Suppose we wish to sample from a target distribution <math>f(x)</math> that is difficult or impossible to sample from directly. Suppose also that we have a proposal distribution <math>g(x)</math> from which we have a reasonable method of sampling (e.g. the uniform distribution). Then, if there is a constant <math>c</math> such that <math>c \cdot g(x) \geq f(x)</math> for all <math>x</math>, accepting each sample drawn from <math>g(x)</math> with probability <math> \frac {f(x)}{c \cdot g(x)} </math> yields a sample that follows the target distribution <math>f(x)</math>.<br />
<br />
'''Proof'''<br />
<br />
Note the following:<br />
*<math> Pr(X|accept) = \frac{Pr(accept|X) \times Pr(X)}{Pr(accept)} </math> (Bayes' theorem)<br />
*<math> Pr(accept|X) = \frac{f(x)}{c \cdot g(x)} </math><br />
*<math> Pr(X) = g(x) </math><br />
<br />
So,<br />
<math> Pr(accept) = \int^{}_x Pr(accept|X) \times Pr(X) dx </math><br />
<math> = \int^{}_x \frac{f(x)}{c \cdot g(x)} g(x) dx </math><br />
<math> = \frac{1}{c} \int^{}_x f(x) dx </math><br />
<math> = \frac{1}{c} </math><br />
<br />
Therefore,<br />
<math> Pr(X|accept) = \frac{\frac{f(x)}{c\ \cdot g(x)} \times g(x)}{\frac{1}{c}} = f(x) </math> as required.<br />
<br />
'''Procedure (Continuous Case)'''<br />
<br />
*Choose <math>g(x)</math> (a density function that is simple to sample from)<br />
*Find a constant c such that :<math> c \cdot g(x) \geq f(x) </math><br />
#Let <math>Y \sim~ g(y)</math> <br />
#Let <math>U \sim~ Unif [0,1] </math><br />
#If <math>U \leq \frac{f(Y)}{c \cdot g(Y)}</math> then X=Y; else reject and go to 1<br />
<br />
'''Example:'''<br />
<br />
Suppose we want to sample from Beta(2,1), for <math> 0 \leq x \leq 1 </math>.<br />
Recall that the density of Beta(2,1) is:<br />
:<math> Beta(2,1) = \frac{\Gamma (2+1)}{\Gamma (2) \Gamma(1)} \times x^1(1-x)^0 = \frac{2!}{1!0!} \times x = 2x </math><br />
*Choose <math> g(x) \sim~ Unif [0,1] </math><br />
*Find c: c = 2 (see notes below)<br />
#Let <math> Y \sim~ Unif [0,1] </math><br />
#Let <math> U \sim~ Unif [0,1] </math><br />
#If <math>U \leq \frac{2Y}{2} = Y </math>, then X=Y; else go to 1<br />
<br />
<math>c</math> was chosen to be 2 in this example since <math> \max_x \left(\frac{f(x)}{g(x)}\right) = 2 </math>. In general, taking <math> c = \max_x \left(\frac{f(x)}{g(x)}\right) </math> gives the smallest constant satisfying <math> c \cdot g(x) \geq f(x) </math> for all <math>x</math>, which is what we need in order to apply the Acceptance/Rejection Method.<br />
<br />
<br />
In MATLAB, the code that demonstrates the result of this example is:<br />
<br />
 i = 0;                     % number of accepted samples so far<br />
 j = 1;                     % index into the vector x of accepted samples<br />
 while i < 1000<br />
     y = rand;              % draw Y from the proposal g = Unif[0,1]<br />
     u = rand;              % draw U ~ Unif[0,1]<br />
     if u <= y              % accept with probability f(y)/(c*g(y)) = y<br />
         x(j) = y;<br />
         j = j + 1;<br />
         i = i + 1;<br />
     end<br />
 end<br />
 hist(x);<br />
<br />
<br />
The histogram produced here should follow the target distribution, <math>f(x) = 2x</math>, for the interval <math> 0 \leq x \leq 1 </math>.<br />
<br />
The histogram shows the PDF of a Beta(2,1) distribution as expected.<br />
<br />
[[File:BetaDistn.jpg]]<br />
<br />
<br />
'''The Discrete Case'''<br />
<br />
The Acceptance/Rejection Method can be extended to discrete target distributions. The difference compared to the continuous case is that the proposal distribution <math>g(x)</math> must also be a discrete distribution. For our purposes, we can take <math>g(x)</math> to be the discrete uniform distribution on the set of values that X may take on under the target distribution.<br />
<br />
'''Example'''<br />
<br />
Suppose we want to sample from a distribution with the following probability mass function (pmf):<br />
P(X=1) = 0.15<br />
P(X=2) = 0.55<br />
P(X=3) = 0.20<br />
P(X=4) = 0.10 <br />
*Choose <math>g(x)</math> to be the discrete uniform distribution on the set <math>\{1,2,3,4\}</math><br />
*Find c: <math> c = \max \left(\frac{f(x)}{g(x)} \right)= 0.55/0.25 = 2.2 </math><br />
#Generate <math> Y \sim~ Unif \{1,2,3,4\} </math><br />
#Let <math> U \sim~ Unif [0,1] </math><br />
#If <math>U \leq \frac{f(Y)}{2.2 \times 0.25} </math>, then X=Y; else go to 1<br />
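<br />
A minimal Matlab sketch of this discrete example (the sample size of 1000 and the variable names are illustrative assumptions, not part of the lecture):<br />
<br />
 f = [0.15, 0.55, 0.20, 0.10];   % target pmf for X = 1, 2, 3, 4<br />
 c = 2.2;                        % c = max(f./g), where g is 0.25 for each value<br />
 x = zeros(1, 1000);<br />
 n = 0;                          % number of accepted samples so far<br />
 while n < 1000<br />
     y = ceil(4 * rand);         % draw Y uniformly from {1, 2, 3, 4}<br />
     u = rand;                   % draw U ~ Unif[0, 1]<br />
     if u <= f(y) / (c * 0.25)   % accept with probability f(y)/(c*g(y))<br />
         n = n + 1;<br />
         x(n) = y;<br />
     end<br />
 end<br />
 hist(x);<br />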
<br />
'''Limitations'''<br />
<br />
If the proposal distribution is very different from the target distribution, we may have to reject a large number of points before a good sample from the target distribution is established. Since the overall acceptance probability is <math>\frac{1}{c}</math> (see the proof above), a large value of <math>c</math> means many rejections. <br />
<br />
We will now begin to discuss sampling from specific distributions.<br />
<br />
====Review of Gamma Distribution====<br />
<br />
Recall that the cdf of the Gamma distribution is:<br />
<br />
<math> F(x) = \int_0^{\lambda x} \frac{e^{-y}y^{t-1}}{(t-1)!} dy </math><br />
<br />
If we wish to sample from this distribution, both the Inverse Transform Method and the Acceptance/Rejection Method are difficult to apply, since this cdf has no simple closed form to invert.<br />
<br />
<br />
'''Additive Property of Gamma Distribution'''<br />
<br />
Recall that if <math>X_1, \dots, X_t</math> are independent, exponentially distributed random variables with parameter <math> \lambda </math> (in other words, <math> X_i\sim~ Exp (\lambda) </math>), then <math> \Sigma_{i=1}^t X_i \sim~ Gamma (t, \lambda) </math><br />
<br />
It appears that if we want to sample from the Gamma distribution, we can sample from t independent exponential distributions with parameter <math> \lambda </math> (using the Inverse Transform Method) and add them up. Details will be discussed in the next lecture.
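<br />
<br />
A minimal Matlab sketch of this idea, ahead of the details to be covered in the next lecture (the choices <math>t = 5</math>, <math>\lambda = 1</math> and the sample size of 1000 are illustrative assumptions):<br />
<br />
 t = 5; lambda = 1;         % illustrative shape and rate parameters<br />
 u = rand(t, 1000);         % a t-by-1000 matrix of Unif[0,1] draws<br />
 e = -log(1 - u) / lambda;  % each entry is an Exp(lambda) sample (Inverse Transform Method)<br />
 g = sum(e, 1);             % each column sum is then a Gamma(t, lambda) sample<br />
 hist(g)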
<hr />
<div>'''Computational Statistics and Data Analysis''' is a course offered at the University of Waterloo<br /><br />
Spring 2009<br /><br />
Instructor: Ali Ghodsi <br />
<br />
<br />
<br />
==Sampling (Generating random numbers)==<br />
<br />
===Lecture of May 12th, 2009===<br />
<br />
Generating random numbers in a computational setting presents challenges. A good way to generate random numbers in computational statistics involves analyzing various distributions using computational methods. As a result, the probability distribution of each possible numbers appears to be uniform (pseudo-random). Outside a computational setting, presenting a uniform distribution is fairly easy (for example, rolling a fair die repetitively to produce a series of random numbers from 1 to 6).<br />
<br />
We begin by considering the simplest case: the uniform distribution.<br />
<br />
====Multiplicative Congruential Method====<br />
<br />
One way to generate pseudo random numbers from the uniform distribution is using the '''Multiplicative Congruential Method'''. This involves three integer parameters ''a'', ''b'', and ''m'', and a '''seed''' variable ''x<sub>0</sub>''. This method deterministically generates a sequence of numbers (based on the seed) with a seemingly random distribution (with some caveats). It proceeds as follows:<br />
<br />
:<math>x_{i+1} = (ax_{i} + b) \mod{m}</math><br />
<br />
For example, with ''a'' = 13, ''b'' = 0, ''m'' = 31, ''x<sub>0</sub>'' = 1, we have:<br />
<br />
:<math>x_{i+1} = 13x_{i} \mod{31}</math><br />
<br />
So,<br />
<br />
:<math>\begin{align} x_{0} &{}= 1 \end{align}</math><br />
:<math>\begin{align}<br />
x_{1} &{}= 13 \times 1 + 0 \mod{31} = 13 \\<br />
\end{align}</math><br />
:<math>\begin{align}<br />
x_{2} &{}= 13 \times 13 + 0 \mod{31} = 14 \\<br />
\end{align}</math><br />
:<math>\begin{align}<br />
x_{3} &{}= 13 \times 14 + 0 \mod{31} =27 \\<br />
\end{align}</math><br />
<br />
etc.<br />
<br />
For correctly chosen values of ''a'', ''b'', and ''m'', this method will generate a sequence of integers including all integers between 0 and ''m'' - 1. Scaling the output by dividing the terms of the resulting sequence by ''m - 1'', we create a sequence of numbers between 0 and 1, which is similar to sampling from a uniform distribution.<br />
<br />
Of course, not all values of ''a'', ''b'', and ''m'' will behave in this way, and will not be suitable for use in generating pseudo random numbers. <br />
<br />
For example, with ''a'' = 3, ''b'' = 2, ''m'' = 4, ''x<sub>0</sub>'' = 1, we have:<br />
<br />
:<math>x_{i+1} = (3x_{i} + 2) \mod{4}</math><br />
<br />
So,<br />
<br />
:<math>\begin{align} x_{0} &{}= 1 \end{align}</math><br />
:<math>\begin{align}<br />
x_{1} &{}= 3 \times 1 + 2 \mod{4} = 1 \\<br />
\end{align}</math><br />
:<math>\begin{align}<br />
x_{2} &{}= 3 \times 1 + 2 \mod{4} = 1 \\<br />
\end{align}</math><br />
<br />
etc.<br />
<br />
In practice, it has been found by a paper published in 1988 by Park and Miller, that ''a'' = 7<sup>5</sup>, ''b'' = 0, and ''m'' = 2<sup>31</sup> - 1 = 2147483647 (the maximum size of a signed integer in a 32-bit system) are good values for the Multiplicative Congruential Method.<br />
<br />
Java's Random class is based on a generator with ''a'' = 25214903917, ''b'' = 11, and ''m'' = 2<sup>48</sup><ref>http://java.sun.com/javase/6/docs/api/java/util/Random.html#next(int)</ref>. The class returns at most 32 leading bits from each ''x<sub>i</sub>'', so it is possible to get the same value twice in a row, (when ''x<sub>0</sub>'' = 18698324575379, for instance) without repeating it forever.<br />
<br />
====General Methods====<br />
<br />
Since the Multiplicative Congruential Method can only be used for the uniform distribution, other methods must be developed in order to generate pseudo random numbers from other distributions.<br />
<br />
=====Inverse Transform Method=====<br />
<br />
This method uses the fact that when a random sample from the uniform distribution is applied to the inverse of a cumulative density function (cdf) of some distribution, the result is a random sample of that distribution. This is shown by this theorem:<br />
<br />
'''Theorem''':<br />
<br />
If <math>U \sim~ \mathrm{Unif}[0, 1]</math> is a random variable and <math>X = F^{-1}(U)</math>, where F is continuous, monotonic, and is the cumulative density function (cdf) for some distribution, then the distribution of the random variable X is given by F(X).<br />
<br />
'''Proof''':<br />
<br />
Recall that, if ''f'' is the pdf corresponding to F, then,<br />
<br />
:<math>F(x) = P(X \leq x) = \int_{-\infty}^x f(x)</math><br />
<br />
So F is monotonically increasing, since the probability that X is less than a greater number must be greater than the probability that X is less than a lesser number.<br />
<br />
Note also that in the uniform distribution on [0, 1], we have for all ''a'' within [0, 1], <math>P(U \leq a) = a</math>.<br />
<br />
So,<br />
<br />
:<math>\begin{align}<br />
P(F^{-1}(U) \leq x) &{}= P(F(F^{-1}(U)) \leq F(x)) \\<br />
&{}= P(U \leq F(x)) \\<br />
&{}= F(x)<br />
\end{align}</math><br />
<br />
Completing the proof.<br />
<br />
'''Procedure (Continuous Case)'''<br />
<br />
This method then gives us the following procedure for finding pseudo random numbers from a continuous distribution:<br />
<br />
*Step 1: Draw <math>U \sim~ Unif [0, 1] </math>.<br />
*Step 2: Compute <math> X = F^{-1}(U) </math>.<br />
<br />
'''Example''':<br />
<br />
Suppose we want to draw a sample from <math>f(x) = \lambda e^{-\lambda x} </math> where <math>x > 0</math> (the exponential distribution).<br />
<br />
We need to first find <math>F(x)</math> and then its inverse, <math>F^{-1}</math>.<br />
<br />
:<math> F(x) = \int^x_0 \theta e^{-\theta u} du = 1 - e^{-\theta x} </math><br />
<br />
:<math> F^{-1}(x) = \frac{-\log(1-y)}{\theta} = \frac{-\log(u)}{\theta} </math><br />
<br />
Now we can generate our random sample <math>i=1\dots n</math> from <math>f(x)</math> by:<br />
<br />
:<math>1)\ u_i \sim Unif[0, 1]</math><br />
:<math>2)\ x_i = \frac{-\log(u_i)}{\theta}</math><br />
<br />
The <math>x_i</math> are now a random sample from <math>f(x)</math>.<br />
<br />
<br />
This example can be illustrated in Matlab using the code below. Generate <math>u_i</math>, calculate <math>x_i</math> using the above formula and letting <math>\theta=1</math>, plot the histogram of <math>x_i</math>'s for <math>i=1,...,100,000</math>.<br />
<br />
u=rand(1,100000);<br />
x=-log(1-u)/1;<br />
hist(x)<br />
<br />
The histogram shows exponential pattern as expected.<br />
<br />
[[File:HistRandNum.jpg]]<br />
<br />
The major problem with this approach is that we have to find <math>F^{-1}</math> and for many distributions it is too difficult (or impossible) to find the inverse of <math>F(x)</math>. Further, for some distributions it is not even possible to find <math>F(x)</math> (i.e. a closed form expression for the distribution function, or otherwise; even if the closed form expression exists, it's usually difficult to find <math>F^{-1}</math>).<br />
<br />
'''Procedure (Discrete Case)'''<br />
<br />
The above method can be easily adapted to work on discrete distributions as well.<br />
<br />
In general in the discrete case, we have <math>x_0, \dots , x_n</math> where:<br />
<br />
:<math>\begin{align}P(X = x_i) &{}= p_i \end{align}</math><br />
:<math>x_0 \leq x_1 \leq x_2 \dots \leq x_n</math><br />
:<math>\sum p_i = 1</math><br />
<br />
Thus we can define the following method to find pseudo random numbers in the discrete case (note that the less-than signs from class have been changed to less-than-or-equal-to signs by me, since otherwise the case of <math>U = 1</math> is missed):<br />
<br />
*Step 1: Draw <math> U~ \sim~ Unif [0,1] </math>.<br />
*Step 2:<br />
**If <math>U < p_0</math>, return <math>X = x_0</math><br />
**If <math>p_0 \leq U < p_0 + p_1</math>, return <math>X = x_1</math><br />
** ...<br />
**In general, if <math>p_0+ p_1 + \dots + p_{k-1} \leq U < p_0 + \dots + p_k</math>, return <math>X = x_k</math><br />
<br />
'''Example''' (from class):<br />
<br />
Suppose we have the following discrete distribution:<br />
<br />
:<math>\begin{align}<br />
P(X = 0) &{}= 0.3 \\<br />
P(X = 1) &{}= 0.2 \\<br />
P(X = 2) &{}= 0.5<br />
\end{align}</math><br />
<br />
The cumulative density function (cdf) for this distribution is then:<br />
<br />
:<math><br />
F(x) = \begin{cases}<br />
0, & \text{if } x < 0 \\<br />
0.3, & \text{if } 0 \leq x < 1 \\<br />
0.5, & \text{if } 1 \leq x < 2 \\<br />
1, & \text{if } 2 \leq x<br />
\end{cases}</math><br />
<br />
Then we can generate numbers from this distribution like this, given <math>u_0, \dots, u_n</math> from <math>U \sim~ Unif[0, 1]</math>:<br />
<br />
:<math><br />
x_i = \begin{cases}<br />
0, & \text{if } u_i \leq 0.3 \\<br />
1, & \text{if } 0.3 < u_i \leq 0.5 \\<br />
2, & \text{if } 0.5 < u_i \leq 1<br />
\end{cases}</math><br />
<br />
This example can be illustrated in Matlab using the code below:<br />
<br />
p=[0.3,0.2,0.5];<br />
for i=1:1000;<br />
u=rand;<br />
if u <= p(1)<br />
x(i)=0;<br />
elseif u < sum(p(1,2))<br />
x(i)=1;<br />
else<br />
x(i)=2;<br />
end<br />
end<br />
<br />
===Lecture of May 14th, 2009===<br />
<br />
Today, we continue the discussion on sampling (generating random numbers) from general distributions with the '''Acceptance/Rejection Method'''.<br />
<br />
====Acceptance/Rejection Method====<br />
<br />
Suppose we wish to sample from a target distribution <math>f(x)</math> that is difficult or impossible to sample from directly. Suppose also that we have a proposal distribution <math>g(x)</math> from which we have a reasonable method of sampling (e.g. the uniform distribution). Then, if there is a constant <math>c \ |\ c \cdot g(x) \geq f(x)\ \forall x</math>, accepting samples drawn in successions from the distribution <math>g(x)</math> with probability :<math> \frac {f(x)}{c \cdot g(x)} </math> will yield a sample that follows the target distribution <math>f(x)</math>.<br />
<br />
'''Proof'''<br />
<br />
Note the following:<br />
*<math> Pr(X|accept) = \frac{Pr(accept|X) \times Pr(X)}{Pr(accept)} </math> (Bayes' theorem)<br />
*<math> Pr(accept|X) = \frac{f(x)}{c \cdot g(x)} </math><br />
*<math> Pr(X) = g(x)\frac{}{}</math><br />
<br />
So,<br />
<math> Pr(accept) = \int^{}_x Pr(accept|X) \times Pr(X) dx </math><br />
<math> = \int^{}_x \frac{f(x)}{c \cdot g(x)} g(x) dx </math><br />
<math> = \frac{1}{c} \int^{}_x f(x) dx </math><br />
<math> = \frac{1}{c} </math><br />
<br />
Therefore,<br />
<math> Pr(X|accept) = \frac{\frac{f(x)}{c\ \cdot g(x)} \times g(x)}{\frac{1}{c}} = f(x) </math> as required.<br />
<br />
'''Procedure (Continuous Case)'''<br />
<br />
*Choose <math>g(x)</math> (a density function that is simple to sample from)<br />
*Find a constant c such that :<math> c \cdot g(x) \geq f(x) </math><br />
#Let <math>Y \sim~ g(y)</math> <br />
#Let <math>U \sim~ Unif [0,1] </math><br />
#If <math>U \leq \frac{f(x)}{c \cdot g(x)}</math> then X=Y; else reject and go to 1<br />
<br />
'''Example:'''<br />
<br />
Suppose we want to sample from Beta(2,1), for <math> 0 \leq x \leq 1 </math>.<br />
Recall:<br />
:<math> Beta(2,1) = \frac{\Gamma (2+1)}{\Gamma (2) \Gamma(1)} \times x^1(1-x)^0 = \frac{2!}{1!0!} \times x = 2x </math><br />
*Choose <math> g(x) \sim~ Unif [0,1] </math><br />
*Find c: c = 2 (see notes below)<br />
#Let <math> Y \sim~ Unif [0,1] </math><br />
#Let <math> U \sim~ Unif [0,1] </math><br />
#If <math>U \leq \frac{2Y}{2} = Y </math>, then X=Y; else go to 1<br />
<br />
<math>c</math> was chosen to be 2 in this example since <math> \max \left(\frac{f(x)}{g(x)}\right) </math> in this example is 2. This condition is important since it helps us in finding a suitable <math>c</math> to apply the Acceptance/Rejection Method.<br />
<br />
<br />
In MATLAB, the code that demonstrates the result of this example is:<br />
<br />
j = 1;<br />
while i < 1000<br />
y = rand;<br />
u = rand;<br />
if u <= y<br />
x(j) = y;<br />
j = j + 1;<br />
i = i + 1;<br />
end<br />
end<br />
hist(x);<br />
<br />
<br />
The histogram produced here should follow the target distribution, <math>f(x) = 2x</math>, for the interval <math> 0 \leq x \leq 1 </math>.<br />
<br />
The histogram shows the PDF of a Beta(2,1) distribution as expected.<br />
<br />
[[File:BetaDistn.jpg]]<br />
<br />
<br />
'''The Discrete Case'''<br />
<br />
The Acceptance/Rejection Method can be extended for discrete target distributions. The difference compared to the continuous case is that the proposal distribution <math>g(x)</math> must also be discrete distribution. For our purposes, we can consider g(x) to be a discrete uniform distribution on the set of values that X may take on in the target distribution.<br />
<br />
'''Example'''<br />
<br />
Suppose we want to sample from a distribution with the following probability mass function (pmf):<br />
P(X=1) = 0.15<br />
P(X=2) = 0.55<br />
P(X=3) = 0.20<br />
P(X=4) = 0.10 <br />
*Choose <math>g(x)</math> to be the discrete uniform distribution on the set <math>\{1,2,3,4\}</math><br />
*Find c: <math> c = \max \left(\frac{f(x)}{g(x)} \right)= 0.55/0.25 = 2.2 </math><br />
#Generate <math> Y \sim~ Unif \{1,2,3,4\} </math><br />
#Let <math> U \sim~ Unif [0,1] </math><br />
#If <math>U \leq \frac{f(x)}{2.2 \times 0.25} </math>, then X=Y; else go to 1<br />
<br />
'''Limitations'''<br />
<br />
If the proposed distribution is very different from the target distribution, we may have to rejected a large number of points before a good sample size of the target distribution can be established. <br />
<br />
We will now begin to discuss sampling from specific distributions.<br />
<br />
====Review of Gamma Distribution====<br />
<br />
Recall that the cdf of the Gamma distribution is:<br />
<br />
<math> F(x) = \int_0^{\lambda x} \frac{e^{-y}y^{t-1}}{(t-1)!} dy </math><br />
<br />
If we wish to sample from this distribution, it will be difficult for both the Inverse Method and the Acceptance/Rejection Method.<br />
<br />
<br />
'''Additive Property of Gamma Distribution'''<br />
<br />
Recall that if <math>X_1, \dots, X_t</math> are independent exponential distributions with mean <math> \lambda </math> (in other words, <math> X_i\sim~ Exp (\lambda) </math>), then <math> \Sigma_{i=1}^t X_i \sim~ Gamma (t, \lambda) </math><br />
<br />
It appears that if we want to sample from the Gamma distribution, we can consider sampling from t independent exponential distributions with mean <math> \lambda </math> (using the Inverse Method) and add them up. Details will be discussed in the next lecture.</div>Jsiswantohttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=stat341_/_CM_361&diff=198stat341 / CM 3612009-05-19T02:20:38Z<p>Jsiswanto: /* Lecture of May 12th, 2009 */</p>
<hr />
<div>'''Computational Statistics and Data Analysis''' is a course offered at the University of Waterloo<br /><br />
Spring 2009<br /><br />
Instructor: Ali Ghodsi <br />
<br />
<br />
<br />
==Sampling (Generating random numbers)==<br />
<br />
===Lecture of May 12th, 2009===<br />
<br />
Generating random numbers in a computational setting presents challenges. A good way to generate random numbers in computational statistics involves analyzing various distributions using computational methods. As a result, the probability distribution of each possible numbers appears to be random (pseudo-random). Outside a computational setting, presenting a uniform distribution is fairly easy (for example, rolling a fair die repetitively to produce a series of random numbers from 1 to 6).<br />
<br />
We begin by considering the simplest case: the uniform distribution.<br />
<br />
====Multiplicative Congruential Method====<br />
<br />
One way to generate pseudo random numbers from the uniform distribution is using the '''Multiplicative Congruential Method'''. This involves three integer parameters ''a'', ''b'', and ''m'', and a '''seed''' variable ''x<sub>0</sub>''. This method deterministically generates a sequence of numbers (based on the seed) with a seemingly random distribution (with some caveats). It proceeds as follows:<br />
<br />
:<math>x_{i+1} = (ax_{i} + b) \mod{m}</math><br />
<br />
For example, with ''a'' = 13, ''b'' = 0, ''m'' = 31, ''x<sub>0</sub>'' = 1, we have:<br />
<br />
:<math>x_{i+1} = 13x_{i} \mod{31}</math><br />
<br />
So,<br />
<br />
:<math>\begin{align} x_{0} &{}= 1 \end{align}</math><br />
:<math>\begin{align}<br />
x_{1} &{}= 13 \times 1 + 0 \mod{31} = 13 \\<br />
\end{align}</math><br />
:<math>\begin{align}<br />
x_{2} &{}= 13 \times 13 + 0 \mod{31} = 14 \\<br />
\end{align}</math><br />
:<math>\begin{align}<br />
x_{3} &{}= 13 \times 14 + 0 \mod{31} =27 \\<br />
\end{align}</math><br />
<br />
etc.<br />
<br />
For correctly chosen values of ''a'', ''b'', and ''m'', this method will generate a sequence of integers including all integers between 0 and ''m'' - 1. Scaling the output by dividing the terms of the resulting sequence by ''m - 1'', we create a sequence of numbers between 0 and 1, which is similar to sampling from a uniform distribution.<br />
<br />
Of course, not all values of ''a'', ''b'', and ''m'' will behave in this way, and will not be suitable for use in generating pseudo random numbers. <br />
<br />
For example, with ''a'' = 3, ''b'' = 2, ''m'' = 4, ''x<sub>0</sub>'' = 1, we have:<br />
<br />
:<math>x_{i+1} = (3x_{i} + 2) \mod{4}</math><br />
<br />
So,<br />
<br />
:<math>\begin{align} x_{0} &{}= 1 \end{align}</math><br />
:<math>\begin{align}<br />
x_{1} &{}= 3 \times 1 + 2 \mod{4} = 1 \\<br />
\end{align}</math><br />
:<math>\begin{align}<br />
x_{2} &{}= 3 \times 1 + 2 \mod{4} = 1 \\<br />
\end{align}</math><br />
<br />
etc.<br />
<br />
In practice, it has been found by a paper published in 1988 by Park and Miller, that ''a'' = 7<sup>5</sup>, ''b'' = 0, and ''m'' = 2<sup>31</sup> - 1 = 2147483647 (the maximum size of a signed integer in a 32-bit system) are good values for the Multiplicative Congruential Method.<br />
<br />
Java's Random class is based on a generator with ''a'' = 25214903917, ''b'' = 11, and ''m'' = 2<sup>48</sup><ref>http://java.sun.com/javase/6/docs/api/java/util/Random.html#next(int)</ref>. The class returns at most 32 leading bits from each ''x<sub>i</sub>'', so it is possible to get the same value twice in a row, (when ''x<sub>0</sub>'' = 18698324575379, for instance) without repeating it forever.<br />
<br />
====General Methods====<br />
<br />
Since the Multiplicative Congruential Method can only be used for the uniform distribution, other methods must be developed in order to generate pseudo random numbers from other distributions.<br />
<br />
=====Inverse Transform Method=====<br />
<br />
This method uses the fact that when a random sample from the uniform distribution is applied to the inverse of a cumulative density function (cdf) of some distribution, the result is a random sample of that distribution. This is shown by this theorem:<br />
<br />
'''Theorem''':<br />
<br />
If <math>U \sim~ \mathrm{Unif}[0, 1]</math> is a random variable and <math>X = F^{-1}(U)</math>, where F is continuous, monotonic, and is the cumulative density function (cdf) for some distribution, then the distribution of the random variable X is given by F(X).<br />
<br />
'''Proof''':<br />
<br />
Recall that, if ''f'' is the pdf corresponding to F, then,<br />
<br />
:<math>F(x) = P(X \leq x) = \int_{-\infty}^x f(x)</math><br />
<br />
So F is monotonically increasing, since the probability that X is less than a greater number must be greater than the probability that X is less than a lesser number.<br />
<br />
Note also that in the uniform distribution on [0, 1], we have for all ''a'' within [0, 1], <math>P(U \leq a) = a</math>.<br />
<br />
So,<br />
<br />
:<math>\begin{align}<br />
P(F^{-1}(U) \leq x) &{}= P(F(F^{-1}(U)) \leq F(x)) \\<br />
&{}= P(U \leq F(x)) \\<br />
&{}= F(x)<br />
\end{align}</math><br />
<br />
Completing the proof.<br />
<br />
'''Procedure (Continuous Case)'''<br />
<br />
This method then gives us the following procedure for finding pseudo random numbers from a continuous distribution:<br />
<br />
*Step 1: Draw <math>U \sim~ Unif [0, 1] </math>.<br />
*Step 2: Compute <math> X = F^{-1}(U) </math>.<br />
<br />
'''Example''':<br />
<br />
Suppose we want to draw a sample from <math>f(x) = \lambda e^{-\lambda x} </math> where <math>x > 0</math> (the exponential distribution).<br />
<br />
We need to first find <math>F(x)</math> and then its inverse, <math>F^{-1}</math>.<br />
<br />
:<math> F(x) = \int^x_0 \theta e^{-\theta u} du = 1 - e^{-\theta x} </math><br />
<br />
:<math> F^{-1}(x) = \frac{-\log(1-y)}{\theta} = \frac{-\log(u)}{\theta} </math><br />
<br />
Now we can generate our random sample <math>i=1\dots n</math> from <math>f(x)</math> by:<br />
<br />
:<math>1)\ u_i \sim Unif[0, 1]</math><br />
:<math>2)\ x_i = \frac{-\log(u_i)}{\theta}</math><br />
<br />
The <math>x_i</math> are now a random sample from <math>f(x)</math>.<br />
<br />
<br />
This example can be illustrated in Matlab using the code below. Generate <math>u_i</math>, calculate <math>x_i</math> using the above formula and letting <math>\theta=1</math>, plot the histogram of <math>x_i</math>'s for <math>i=1,...,100,000</math>.<br />
<br />
u=rand(1,100000);<br />
x=-log(1-u)/1;<br />
hist(x)<br />
<br />
The histogram shows exponential pattern as expected.<br />
<br />
[[File:HistRandNum.jpg]]<br />
<br />
The major problem with this approach is that we have to find <math>F^{-1}</math> and for many distributions it is too difficult (or impossible) to find the inverse of <math>F(x)</math>. Further, for some distributions it is not even possible to find <math>F(x)</math> (i.e. a closed form expression for the distribution function, or otherwise; even if the closed form expression exists, it's usually difficult to find <math>F^{-1}</math>).<br />
<br />
'''Procedure (Discrete Case)'''<br />
<br />
The above method can be easily adapted to work on discrete distributions as well.<br />
<br />
In general in the discrete case, we have <math>x_0, \dots , x_n</math> where:<br />
<br />
:<math>\begin{align}P(X = x_i) &{}= p_i \end{align}</math><br />
:<math>x_0 \leq x_1 \leq x_2 \dots \leq x_n</math><br />
:<math>\sum p_i = 1</math><br />
<br />
Thus we can define the following method to find pseudo random numbers in the discrete case (note that the less-than signs from class have been changed to less-than-or-equal-to signs by me, since otherwise the case of <math>U = 1</math> is missed):<br />
<br />
*Step 1: Draw <math> U~ \sim~ Unif [0,1] </math>.<br />
*Step 2:<br />
**If <math>U < p_0</math>, return <math>X = x_0</math><br />
**If <math>p_0 \leq U < p_0 + p_1</math>, return <math>X = x_1</math><br />
** ...<br />
**In general, if <math>p_0+ p_1 + \dots + p_{k-1} \leq U < p_0 + \dots + p_k</math>, return <math>X = x_k</math><br />
<br />
'''Example''' (from class):<br />
<br />
Suppose we have the following discrete distribution:<br />
<br />
:<math>\begin{align}<br />
P(X = 0) &{}= 0.3 \\<br />
P(X = 1) &{}= 0.2 \\<br />
P(X = 2) &{}= 0.5<br />
\end{align}</math><br />
<br />
The cumulative density function (cdf) for this distribution is then:<br />
<br />
:<math><br />
F(x) = \begin{cases}<br />
0, & \text{if } x < 0 \\<br />
0.3, & \text{if } 0 \leq x < 1 \\<br />
0.5, & \text{if } 1 \leq x < 2 \\<br />
1, & \text{if } 2 \leq x<br />
\end{cases}</math><br />
<br />
Then we can generate numbers from this distribution like this, given <math>u_0, \dots, u_n</math> from <math>U \sim~ Unif[0, 1]</math>:<br />
<br />
:<math><br />
x_i = \begin{cases}<br />
0, & \text{if } u_i \leq 0.3 \\<br />
1, & \text{if } 0.3 < u_i \leq 0.5 \\<br />
2, & \text{if } 0.5 < u_i \leq 1<br />
\end{cases}</math><br />
<br />
This example can be illustrated in Matlab using the code below:<br />
<br />
p=[0.3,0.2,0.5];<br />
for i=1:1000;<br />
u=rand;<br />
if u <= p(1)<br />
x(i)=0;<br />
elseif u < sum(p(1,2))<br />
x(i)=1;<br />
else<br />
x(i)=2;<br />
end<br />
end<br />
<br />
===Lecture of May 14th, 2009===<br />
<br />
Today, we continue the discussion on sampling (generating random numbers) from general distributions with the '''Acceptance/Rejection Method'''.<br />
<br />
====Acceptance/Rejection Method====<br />
<br />
Suppose we wish to sample from a target distribution <math>f(x)</math> that is difficult or impossible to sample from directly. Suppose also that we have a proposal distribution <math>g(x)</math> from which we have a reasonable method of sampling (e.g. the uniform distribution). Then, if there is a constant <math>c \ |\ c \cdot g(x) \geq f(x)\ \forall x</math>, accepting samples drawn in successions from the distribution <math>g(x)</math> with probability :<math> \frac {f(x)}{c \cdot g(x)} </math> will yield a sample that follows the target distribution <math>f(x)</math>.<br />
<br />
'''Proof'''<br />
<br />
Note the following:<br />
*<math> Pr(X|accept) = \frac{Pr(accept|X) \times Pr(X)}{Pr(accept)} </math> (Bayes' theorem)<br />
*<math> Pr(accept|X) = \frac{f(x)}{c \cdot g(x)} </math><br />
*<math> Pr(X) = g(x)\frac{}{}</math><br />
<br />
So,<br />
<math> Pr(accept) = \int^{}_x Pr(accept|X) \times Pr(X) dx </math><br />
<math> = \int^{}_x \frac{f(x)}{c \cdot g(x)} g(x) dx </math><br />
<math> = \frac{1}{c} \int^{}_x f(x) dx </math><br />
<math> = \frac{1}{c} </math><br />
<br />
Therefore,<br />
<math> Pr(X|accept) = \frac{\frac{f(x)}{c\ \cdot g(x)} \times g(x)}{\frac{1}{c}} = f(x) </math> as required.<br />
<br />
'''Procedure (Continuous Case)'''<br />
<br />
*Choose <math>g(x)</math> (a density function that is simple to sample from)<br />
*Find a constant c such that :<math> c \cdot g(x) \geq f(x) </math><br />
#Let <math>Y \sim~ g(y)</math> <br />
#Let <math>U \sim~ Unif [0,1] </math><br />
#If <math>U \leq \frac{f(x)}{c \cdot g(x)}</math> then X=Y; else reject and go to 1<br />
<br />
'''Example:'''<br />
<br />
Suppose we want to sample from Beta(2,1), for <math> 0 \leq x \leq 1 </math>.<br />
Recall:<br />
:<math> Beta(2,1) = \frac{\Gamma (2+1)}{\Gamma (2) \Gamma(1)} \times x^1(1-x)^0 = \frac{2!}{1!0!} \times x = 2x </math><br />
*Choose <math> g(x) \sim~ Unif [0,1] </math><br />
*Find c: c = 2 (see notes below)<br />
#Let <math> Y \sim~ Unif [0,1] </math><br />
#Let <math> U \sim~ Unif [0,1] </math><br />
#If <math>U \leq \frac{2Y}{2} = Y </math>, then X=Y; else go to 1<br />
<br />
<math>c</math> was chosen to be 2 in this example since <math> \max \left(\frac{f(x)}{g(x)}\right) </math> in this example is 2. This condition is important since it helps us in finding a suitable <math>c</math> to apply the Acceptance/Rejection Method.<br />
<br />
<br />
In MATLAB, the code that demonstrates the result of this example is:<br />
<br />
j = 1;<br />
while i < 1000<br />
y = rand;<br />
u = rand;<br />
if u <= y<br />
x(j) = y;<br />
j = j + 1;<br />
i = i + 1;<br />
end<br />
end<br />
hist(x);<br />
<br />
<br />
The histogram produced here should follow the target distribution, <math>f(x) = 2x</math>, for the interval <math> 0 \leq x \leq 1 </math>.<br />
<br />
The histogram shows the PDF of a Beta(2,1) distribution as expected.<br />
<br />
[[File:BetaDistn.jpg]]<br />
<br />
<br />
'''The Discrete Case'''<br />
<br />
The Acceptance/Rejection Method can be extended for discrete target distributions. The difference compared to the continuous case is that the proposal distribution <math>g(x)</math> must also be discrete distribution. For our purposes, we can consider g(x) to be a discrete uniform distribution on the set of values that X may take on in the target distribution.<br />
<br />
'''Example'''<br />
<br />
Suppose we want to sample from a distribution with the following probability mass function (pmf):<br />
P(X=1) = 0.15<br />
P(X=2) = 0.55<br />
P(X=3) = 0.20<br />
P(X=4) = 0.10 <br />
*Choose <math>g(x)</math> to be the discrete uniform distribution on the set <math>\{1,2,3,4\}</math><br />
*Find c: <math> c = \max \left(\frac{f(x)}{g(x)} \right)= 0.55/0.25 = 2.2 </math><br />
#Generate <math> Y \sim~ Unif \{1,2,3,4\} </math><br />
#Let <math> U \sim~ Unif [0,1] </math><br />
#If <math>U \leq \frac{f(x)}{2.2 \times 0.25} </math>, then X=Y; else go to 1<br />
<br />
'''Limitations'''<br />
<br />
If the proposed distribution is very different from the target distribution, we may have to rejected a large number of points before a good sample size of the target distribution can be established. <br />
<br />
We will now begin to discuss sampling from specific distributions.<br />
<br />
====Review of Gamma Distribution====<br />
<br />
Recall that the cdf of the Gamma distribution is:<br />
<br />
<math> F(x) = \int_0^{\lambda x} \frac{e^{-y}y^{t-1}}{(t-1)!} dy </math><br />
<br />
If we wish to sample from this distribution, it will be difficult for both the Inverse Method and the Acceptance/Rejection Method.<br />
<br />
<br />
'''Additive Property of Gamma Distribution'''<br />
<br />
Recall that if <math>X_1, \dots, X_t</math> are independent exponential distributions with mean <math> \lambda </math> (in other words, <math> X_i\sim~ Exp (\lambda) </math>), then <math> \Sigma_{i=1}^t X_i \sim~ Gamma (t, \lambda) </math><br />
<br />
It appears that if we want to sample from the Gamma distribution, we can consider sampling from t independent exponential distributions with mean <math> \lambda </math> (using the Inverse Method) and add them up. Details will be discussed in the next lecture.</div>Jsiswantohttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=stat341_/_CM_361&diff=197stat341 / CM 3612009-05-19T02:20:01Z<p>Jsiswanto: /* Lecture of May 12th, 2009 */</p>
<hr />
<div>'''Computational Statistics and Data Analysis''' is a course offered at the University of Waterloo<br /><br />
Spring 2009<br /><br />
Instructor: Ali Ghodsi <br />
<br />
<br />
<br />
==Sampling (Generating random numbers)==<br />
<br />
===Lecture of May 12th, 2009===<br />
<br />
Generating random numbers in a computational setting presents challenges. A good way to generate random numbers in computational statistics involves analyzing various distributions using computational methods. As a result, the probability distribution of each of all the possible numbers appears to be random (pseudo-random). Outside a computational setting, presenting a uniform distribution is fairly easy (for example, rolling a fair die repetitively to produce a series of random numbers from 1 to 6).<br />
<br />
We begin by considering the simplest case: the uniform distribution.<br />
<br />
====Multiplicative Congruential Method====<br />
<br />
One way to generate pseudo random numbers from the uniform distribution is using the '''Multiplicative Congruential Method'''. This involves three integer parameters ''a'', ''b'', and ''m'', and a '''seed''' variable ''x<sub>0</sub>''. This method deterministically generates a sequence of numbers (based on the seed) with a seemingly random distribution (with some caveats). It proceeds as follows:<br />
<br />
:<math>x_{i+1} = (ax_{i} + b) \mod{m}</math><br />
<br />
For example, with ''a'' = 13, ''b'' = 0, ''m'' = 31, ''x<sub>0</sub>'' = 1, we have:<br />
<br />
:<math>x_{i+1} = 13x_{i} \mod{31}</math><br />
<br />
So,<br />
<br />
:<math>\begin{align} x_{0} &{}= 1 \end{align}</math><br />
:<math>\begin{align}<br />
x_{1} &{}= 13 \times 1 + 0 \mod{31} = 13 \\<br />
\end{align}</math><br />
:<math>\begin{align}<br />
x_{2} &{}= 13 \times 13 + 0 \mod{31} = 14 \\<br />
\end{align}</math><br />
:<math>\begin{align}<br />
x_{3} &{}= 13 \times 14 + 0 \mod{31} =27 \\<br />
\end{align}</math><br />
<br />
etc.<br />
<br />
For correctly chosen values of ''a'', ''b'', and ''m'', this method will generate a sequence of integers including all integers between 0 and ''m'' - 1. Scaling the output by dividing the terms of the resulting sequence by ''m - 1'', we create a sequence of numbers between 0 and 1, which is similar to sampling from a uniform distribution.<br />
<br />
Of course, not all values of ''a'', ''b'', and ''m'' will behave in this way, and will not be suitable for use in generating pseudo random numbers. <br />
<br />
For example, with ''a'' = 3, ''b'' = 2, ''m'' = 4, ''x<sub>0</sub>'' = 1, we have:<br />
<br />
:<math>x_{i+1} = (3x_{i} + 2) \mod{4}</math><br />
<br />
So,<br />
<br />
:<math>\begin{align} x_{0} &{}= 1 \end{align}</math><br />
:<math>\begin{align}<br />
x_{1} &{}= 3 \times 1 + 2 \mod{4} = 1 \\<br />
\end{align}</math><br />
:<math>\begin{align}<br />
x_{2} &{}= 3 \times 1 + 2 \mod{4} = 1 \\<br />
\end{align}</math><br />
<br />
etc.<br />
<br />
In practice, it has been found by a paper published in 1988 by Park and Miller, that ''a'' = 7<sup>5</sup>, ''b'' = 0, and ''m'' = 2<sup>31</sup> - 1 = 2147483647 (the maximum size of a signed integer in a 32-bit system) are good values for the Multiplicative Congruential Method.<br />
<br />
Java's Random class is based on a generator with ''a'' = 25214903917, ''b'' = 11, and ''m'' = 2<sup>48</sup><ref>http://java.sun.com/javase/6/docs/api/java/util/Random.html#next(int)</ref>. The class returns at most 32 leading bits from each ''x<sub>i</sub>'', so it is possible to get the same value twice in a row, (when ''x<sub>0</sub>'' = 18698324575379, for instance) without repeating it forever.<br />
<br />
====General Methods====<br />
<br />
Since the Multiplicative Congruential Method can only be used for the uniform distribution, other methods must be developed in order to generate pseudo random numbers from other distributions.<br />
<br />
=====Inverse Transform Method=====<br />
<br />
This method uses the fact that when a random sample from the uniform distribution is applied to the inverse of a cumulative density function (cdf) of some distribution, the result is a random sample of that distribution. This is shown by this theorem:<br />
<br />
'''Theorem''':<br />
<br />
If <math>U \sim~ \mathrm{Unif}[0, 1]</math> is a random variable and <math>X = F^{-1}(U)</math>, where F is continuous, monotonic, and is the cumulative density function (cdf) for some distribution, then the distribution of the random variable X is given by F(X).<br />
<br />
'''Proof''':<br />
<br />
Recall that, if ''f'' is the pdf corresponding to F, then,<br />
<br />
:<math>F(x) = P(X \leq x) = \int_{-\infty}^x f(x)</math><br />
<br />
So F is monotonically increasing, since the probability that X is less than a greater number must be greater than the probability that X is less than a lesser number.<br />
<br />
Note also that in the uniform distribution on [0, 1], we have for all ''a'' within [0, 1], <math>P(U \leq a) = a</math>.<br />
<br />
So,<br />
<br />
:<math>\begin{align}<br />
P(F^{-1}(U) \leq x) &{}= P(F(F^{-1}(U)) \leq F(x)) \\<br />
&{}= P(U \leq F(x)) \\<br />
&{}= F(x)<br />
\end{align}</math><br />
<br />
Completing the proof.<br />
<br />
'''Procedure (Continuous Case)'''<br />
<br />
This method then gives us the following procedure for finding pseudo random numbers from a continuous distribution:<br />
<br />
*Step 1: Draw <math>U \sim~ Unif [0, 1] </math>.<br />
*Step 2: Compute <math> X = F^{-1}(U) </math>.<br />
<br />
'''Example''':<br />
<br />
Suppose we want to draw a sample from <math>f(x) = \lambda e^{-\lambda x} </math> where <math>x > 0</math> (the exponential distribution).<br />
<br />
We need to first find <math>F(x)</math> and then its inverse, <math>F^{-1}</math>.<br />
<br />
:<math> F(x) = \int^x_0 \theta e^{-\theta u} du = 1 - e^{-\theta x} </math><br />
<br />
:<math> F^{-1}(x) = \frac{-\log(1-y)}{\theta} = \frac{-\log(u)}{\theta} </math><br />
<br />
Now we can generate our random sample <math>i=1\dots n</math> from <math>f(x)</math> by:<br />
<br />
:<math>1)\ u_i \sim Unif[0, 1]</math><br />
:<math>2)\ x_i = \frac{-\log(u_i)}{\theta}</math><br />
<br />
The <math>x_i</math> are now a random sample from <math>f(x)</math>.<br />
<br />
<br />
This example can be illustrated in Matlab using the code below. Generate <math>u_i</math>, calculate <math>x_i</math> using the above formula and letting <math>\theta=1</math>, plot the histogram of <math>x_i</math>'s for <math>i=1,...,100,000</math>.<br />
<br />
u=rand(1,100000);<br />
x=-log(1-u)/1;<br />
hist(x)<br />
<br />
The histogram shows exponential pattern as expected.<br />
<br />
[[File:HistRandNum.jpg]]<br />
<br />
The major problem with this approach is that we have to find <math>F^{-1}</math> and for many distributions it is too difficult (or impossible) to find the inverse of <math>F(x)</math>. Further, for some distributions it is not even possible to find <math>F(x)</math> (i.e. a closed form expression for the distribution function, or otherwise; even if the closed form expression exists, it's usually difficult to find <math>F^{-1}</math>).<br />
<br />
'''Procedure (Discrete Case)'''<br />
<br />
The above method can be easily adapted to work on discrete distributions as well.<br />
<br />
In general in the discrete case, we have <math>x_0, \dots , x_n</math> where:<br />
<br />
:<math>\begin{align}P(X = x_i) &{}= p_i \end{align}</math><br />
:<math>x_0 \leq x_1 \leq x_2 \dots \leq x_n</math><br />
:<math>\sum p_i = 1</math><br />
<br />
Thus we can define the following method to find pseudo random numbers in the discrete case (note that the less-than signs from class have been changed to less-than-or-equal-to signs by me, since otherwise the case of <math>U = 1</math> is missed):<br />
<br />
*Step 1: Draw <math> U~ \sim~ Unif [0,1] </math>.<br />
*Step 2:<br />
**If <math>U < p_0</math>, return <math>X = x_0</math><br />
**If <math>p_0 \leq U < p_0 + p_1</math>, return <math>X = x_1</math><br />
** ...<br />
**In general, if <math>p_0+ p_1 + \dots + p_{k-1} \leq U < p_0 + \dots + p_k</math>, return <math>X = x_k</math><br />
<br />
'''Example''' (from class):<br />
<br />
Suppose we have the following discrete distribution:<br />
<br />
:<math>\begin{align}<br />
P(X = 0) &{}= 0.3 \\<br />
P(X = 1) &{}= 0.2 \\<br />
P(X = 2) &{}= 0.5<br />
\end{align}</math><br />
<br />
The cumulative density function (cdf) for this distribution is then:<br />
<br />
:<math><br />
F(x) = \begin{cases}<br />
0, & \text{if } x < 0 \\<br />
0.3, & \text{if } 0 \leq x < 1 \\<br />
0.5, & \text{if } 1 \leq x < 2 \\<br />
1, & \text{if } 2 \leq x<br />
\end{cases}</math><br />
<br />
Then we can generate numbers from this distribution like this, given <math>u_0, \dots, u_n</math> from <math>U \sim~ Unif[0, 1]</math>:<br />
<br />
:<math><br />
x_i = \begin{cases}<br />
0, & \text{if } u_i \leq 0.3 \\<br />
1, & \text{if } 0.3 < u_i \leq 0.5 \\<br />
2, & \text{if } 0.5 < u_i \leq 1<br />
\end{cases}</math><br />
<br />
This example can be illustrated in Matlab using the code below:<br />
<br />
p=[0.3,0.2,0.5];<br />
for i=1:1000;<br />
u=rand;<br />
if u <= p(1)<br />
x(i)=0;<br />
elseif u < sum(p(1,2))<br />
x(i)=1;<br />
else<br />
x(i)=2;<br />
end<br />
end<br />
<br />
===Lecture of May 14th, 2009===<br />
<br />
Today, we continue the discussion on sampling (generating random numbers) from general distributions with the '''Acceptance/Rejection Method'''.<br />
<br />
====Acceptance/Rejection Method====<br />
<br />
Suppose we wish to sample from a target distribution <math>f(x)</math> that is difficult or impossible to sample from directly. Suppose also that we have a proposal distribution <math>g(x)</math> from which we have a reasonable method of sampling (e.g. the uniform distribution). Then, if there is a constant <math>c \ |\ c \cdot g(x) \geq f(x)\ \forall x</math>, accepting samples drawn in successions from the distribution <math>g(x)</math> with probability :<math> \frac {f(x)}{c \cdot g(x)} </math> will yield a sample that follows the target distribution <math>f(x)</math>.<br />
<br />
'''Proof'''<br />
<br />
Note the following:<br />
*<math> Pr(X|accept) = \frac{Pr(accept|X) \times Pr(X)}{Pr(accept)} </math> (Bayes' theorem)<br />
*<math> Pr(accept|X) = \frac{f(x)}{c \cdot g(x)} </math><br />
*<math> Pr(X) = g(x)\frac{}{}</math><br />
<br />
So,<br />
<math> Pr(accept) = \int^{}_x Pr(accept|X) \times Pr(X) dx </math><br />
<math> = \int^{}_x \frac{f(x)}{c \cdot g(x)} g(x) dx </math><br />
<math> = \frac{1}{c} \int^{}_x f(x) dx </math><br />
<math> = \frac{1}{c} </math><br />
<br />
Therefore,<br />
<math> Pr(X|accept) = \frac{\frac{f(x)}{c\ \cdot g(x)} \times g(x)}{\frac{1}{c}} = f(x) </math> as required.<br />
<br />
'''Procedure (Continuous Case)'''<br />
<br />
*Choose <math>g(x)</math> (a density function that is simple to sample from)<br />
*Find a constant c such that :<math> c \cdot g(x) \geq f(x) </math><br />
#Let <math>Y \sim~ g(y)</math> <br />
#Let <math>U \sim~ Unif [0,1] </math><br />
#If <math>U \leq \frac{f(x)}{c \cdot g(x)}</math> then X=Y; else reject and go to 1<br />
<br />
'''Example:'''<br />
<br />
Suppose we want to sample from Beta(2,1), for <math> 0 \leq x \leq 1 </math>.<br />
Recall:<br />
:<math> Beta(2,1) = \frac{\Gamma (2+1)}{\Gamma (2) \Gamma(1)} \times x^1(1-x)^0 = \frac{2!}{1!0!} \times x = 2x </math><br />
*Choose <math> g(x) \sim~ Unif [0,1] </math><br />
*Find c: c = 2 (see notes below)<br />
#Let <math> Y \sim~ Unif [0,1] </math><br />
#Let <math> U \sim~ Unif [0,1] </math><br />
#If <math>U \leq \frac{2Y}{2} = Y </math>, then X=Y; else go to 1<br />
<br />
<math>c</math> was chosen to be 2 in this example since <math> \max \left(\frac{f(x)}{g(x)}\right) </math> in this example is 2. This condition is important since it helps us in finding a suitable <math>c</math> to apply the Acceptance/Rejection Method.<br />
<br />
<br />
In MATLAB, the code that demonstrates the result of this example is:<br />
<br />
 i = 0;                  % number of accepted samples so far<br />
 j = 1;                  % index into the output vector x<br />
 while i < 1000<br />
     y = rand;           % proposal Y ~ Unif[0,1]<br />
     u = rand;           % U ~ Unif[0,1]<br />
     if u <= y           % accept with probability f(y)/(c*g(y)) = 2y/2 = y<br />
         x(j) = y;<br />
         j = j + 1;<br />
         i = i + 1;<br />
     end<br />
 end<br />
 hist(x);<br />
<br />
<br />
The histogram produced here should follow the target distribution, <math>f(x) = 2x</math>, for the interval <math> 0 \leq x \leq 1 </math>.<br />
<br />
The histogram shows the PDF of a Beta(2,1) distribution as expected.<br />
<br />
[[File:BetaDistn.jpg]]<br />
<br />
<br />
'''The Discrete Case'''<br />
<br />
The Acceptance/Rejection Method can be extended to discrete target distributions. The difference compared to the continuous case is that the proposal distribution <math>g(x)</math> must also be a discrete distribution. For our purposes, we can take <math>g(x)</math> to be the discrete uniform distribution on the set of values that X may take in the target distribution.<br />
<br />
'''Example'''<br />
<br />
Suppose we want to sample from a distribution with the following probability mass function (pmf):<br />
P(X=1) = 0.15<br />
P(X=2) = 0.55<br />
P(X=3) = 0.20<br />
P(X=4) = 0.10 <br />
*Choose <math>g(x)</math> to be the discrete uniform distribution on the set <math>\{1,2,3,4\}</math><br />
*Find c: <math> c = \max \left(\frac{f(x)}{g(x)} \right)= 0.55/0.25 = 2.2 </math><br />
#Generate <math> Y \sim~ Unif \{1,2,3,4\} </math><br />
#Let <math> U \sim~ Unif [0,1] </math><br />
#If <math>U \leq \frac{f(Y)}{2.2 \times 0.25} </math>, then X=Y; else go to 1 (a MATLAB sketch of this example follows)<br />
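<br />
A minimal MATLAB sketch of this discrete example is given below (the sample size and variable names are illustrative choices):<br />
<br />
 f = [0.15, 0.55, 0.20, 0.10];   % target pmf: f(k) = P(X = k)<br />
 c = 2.2;                        % c = max(f(x)/g(x)), with g(x) = 0.25<br />
 n = 1000;<br />
 x = zeros(1, n);<br />
 accepted = 0;<br />
 while accepted < n<br />
     y = ceil(4 * rand);         % draw Y uniformly from {1,2,3,4}<br />
     u = rand;<br />
     if u <= f(y) / (c * 0.25)   % accept with probability f(y)/(c*g(y))<br />
         accepted = accepted + 1;<br />
         x(accepted) = y;<br />
     end<br />
 end<br />
 hist(x, 1:4);<br />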
<br />
'''Limitations'''<br />
<br />
If the proposal distribution is very different from the target distribution, the constant <math>c</math> must be large, and we may have to reject a large number of points before a good sample from the target distribution can be established.<br />
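<br />
Since <math> Pr(accept) = \frac{1}{c} </math>, on average <math>c</math> proposals are needed for each accepted point. A rough way to see this cost empirically is to count proposals per acceptance; the sketch below is based on the Beta(2,1) example above (where c = 2):<br />
<br />
 proposals = 0;<br />
 accepted = 0;<br />
 while accepted < 1000<br />
     y = rand;<br />
     u = rand;<br />
     proposals = proposals + 1;<br />
     if u <= y                   % acceptance condition for f(x) = 2x, c = 2<br />
         accepted = accepted + 1;<br />
     end<br />
 end<br />
 proposals / accepted            % should be roughly c = 2<br />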
<br />
We will now begin to discuss sampling from specific distributions.<br />
<br />
====Review of Gamma Distribution====<br />
<br />
Recall that the cdf of the Gamma distribution (here written for an integer shape parameter <math>t</math> and rate <math>\lambda</math>) is:<br />
<br />
<math> F(x) = \int_0^{\lambda x} \frac{e^{-y}y^{t-1}}{(t-1)!} dy </math><br />
<br />
If we wish to sample from this distribution, both the Inverse Transform Method and the Acceptance/Rejection Method are difficult to apply directly.<br />
<br />
<br />
'''Additive Property of Gamma Distribution'''<br />
<br />
Recall that if <math>X_1, \dots, X_t</math> are independent exponentially distributed random variables with parameter <math> \lambda </math> (in other words, <math> X_i\sim~ Exp (\lambda) </math>), then <math> \Sigma_{i=1}^t X_i \sim~ Gamma (t, \lambda) </math>.<br />
<br />
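As a preview, a minimal MATLAB sketch of this idea is given below (the choices t = 5, lambda = 1, and the sample size are illustrative); each Gamma draw is the sum of t exponential draws generated with the Inverse Transform Method:<br />
<br />
 t = 5;  lambda = 1;  n = 10000;<br />
 u = rand(t, n);           % t-by-n matrix of Unif[0,1] draws<br />
 e = -log(u) / lambda;     % inverse transform: each entry is an Exp(lambda) draw<br />
 x = sum(e, 1);            % each column sum is one Gamma(t, lambda) sample<br />
 hist(x, 50)<br />
<br />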
It appears that if we want to sample from the Gamma distribution, we can consider sampling from t independent Exp(<math> \lambda </math>) random variables (using the Inverse Transform Method, as sketched above) and adding them up. Details will be discussed in the next lecture.</div>Jsiswantohttp://wiki.math.uwaterloo.ca/statwiki/index.php?title=schedule&diff=148schedule2009-05-13T19:28:48Z<p>Jsiswanto: </p>
<hr />
<div>{| class="wikitable" border="1" cellpadding="2"<br />
|-<br />
|width="100pt"|Date<br />
|width="200pt"|Name<br />
|-<br />
|May 12 || Jeremy Sharpe<br />
|-<br />
|May 14 || Keith, Ho Chi Lam <br />
|-<br />
|May 19 || Mathieu Zerter <br />
|-<br />
|May 21 || Jeff Li <br />
|-<br />
|May 26 || Raghav Malik <br />
|-<br />
|May 28 || Laura Chelaru <br />
|-<br />
|June 2 || Timothy Choy <br />
|-<br />
|June 4 || Wenjing Zhao<br />
|-<br />
|June 9 || Mark Stuart <br />
|-<br />
|June 11 || Alberto Carignano <br />
|-<br />
|June 16 || Jeff Siswanto<br />
|-<br />
|June 18|| Iulia Pargaru <br />
|-<br />
|June 23 || Your name <br />
|-<br />
|June 25|| Your name <br />
|-<br />
|June 30|| Your name <br />
|-<br />
|July 2|| Your name<br />
|-<br />
|July 7|| Your name<br />
|-<br />
|July 9|| Your name<br />
|-<br />
|July 14|| Your name<br />
|-<br />
|July 16|| Your name<br />
|-<br />
|July 21|| Your name<br />
|-<br />
|July 23|| Your name<br />
|-<br />
|July 28|| Your name<br />
|}</div>Jsiswanto