importance Sampling June 2 2009

From statwiki
Revision as of 09:52, 3 June 2009 by Ichelaru (talk | contribs)
Jump to navigation Jump to search
The printable version is no longer supported and may have rendering errors. Please update your browser bookmarks and please use the default browser print function instead.

Importance Sampling

In [math]\displaystyle{ I = \displaystyle\int h(x)f(x)\,dx }[/math], Monte Carlo simulation can be used only if it easy to sample from f(x). Otherwise, another method must be applied. If sampling from f(x) is difficult but there exists a probability distribution function g(x) which is easy to sample from, then [math]\displaystyle{ I }[/math] can be written as

[math]\displaystyle{ I = \displaystyle\int h(x)f(x)\,dx }[/math]
[math]\displaystyle{ = \displaystyle\int \frac{h(x)f(x)}{g(x)}g(x)\,dx }[/math]
[math]\displaystyle{ = \displaystyle E_g(w(x)) \rightarrow }[/math]the expectation of w(x) with respect to g(x)
[math]\displaystyle{ = \frac{\displaystyle\sum_{i=1}^{N} w(x_i)}{N} }[/math] where [math]\displaystyle{ \displaystyle w(x) = \frac{h(x)f(x)}{g(x)} }[/math]

Process

  1. Choose [math]\displaystyle{ \displaystyle g(x) }[/math] such that it's easy to sample from.
  2. Compute [math]\displaystyle{ \displaystyle w(x)=\frac{h(x)f(x)}{g(x)} }[/math]
  3. [math]\displaystyle{ \displaystyle \hat{I} = \frac{\displaystyle\sum_{i=1}^{N} w(x_i)}{N} }[/math]

"Weighted" average

The term "importance sampling" is used to describe this method because a higher 'importance' or 'weighting' is given to the values sampled from [math]\displaystyle{ \displaystyle g(x) }[/math] that are closer to [math]\displaystyle{ \displaystyle f(x) }[/math], the original distribution we would ideally like to sample from (but cannot because it is too difficult).
[math]\displaystyle{ \displaystyle I = \int\frac{h(x)f(x)}{g(x)}g(x)\,dx }[/math]
[math]\displaystyle{ =\displaystyle \int \frac{f(x)}{g(x)}h(x)g(x)\,dx }[/math]
[math]\displaystyle{ =\displaystyle \int \frac{f(x)}{g(x)}E_g(h(x))\,dx }[/math] which is the same thing as saying that we are applying a regular Monte Carlo Simulation method to [math]\displaystyle{ \displaystyle\int h(x)g(x)\,dx }[/math], taking each result from this process and weighting the more accurate ones (i.e. the ones for which [math]\displaystyle{ \displaystyle \frac{f(x)}{g(x)} }[/math] is high) higher.

One can view [math]\displaystyle{ \frac{f(x)}{g(x)}\ = B(x) }[/math] as a weight.

Then [math]\displaystyle{ \displaystyle \hat{I} = \frac{\displaystyle\sum_{i=1}^{N} w(x_i)}{N} = \frac{\displaystyle\sum_{i=1}^{N} B(x_i)*h(x_i)}{N} }[/math]

i.e. we are computing a weighted sum of [math]\displaystyle{ h(x_i) }[/math] instead of a sum.