markov Chain Definitions: Difference between revisions

From statwiki
Jump to navigation Jump to search
(Created page with '**Rough Work** Markov chains. Can be used to determine probability of a state after a certain number of transition steps. In a Homogeneous Markov Chain the transition matrix do...')
 
(*rough work*)
Line 1: Line 1:
**Rough Work**
Practical application for estimation:
The general concept for the application of this lies within having a set of generated <math>x_i</math> which approach a distribution <math>f(x)</math> so that a variation of importance estimation can be used to estimate an integral in the form<br />
<math> I = \displaystyle\int^\ h(x)f(x)\,dx </math> by <math>\hat{I} = \frac{1}{N}\displaystyle\sum_{i=1}^Nh(x_i)</math><br />
All that is required is a Markov chain which eventually converges to <math>f(x)</math>.
<br /><br />
In the previous example, the entries <math>p_{ij}</math> in the transition matrix <math>P</math> represent the probability of reaching state <math>j</math> from state <math>i</math> after one step.  For this reason, the sum over all entries j in a particular column sum to 1, as this itself must be a pmf if a transition from <math>i</math> is to lead to a state still within the state set for <math>X_t</math>.


Markov chains.
'''Homogeneous Markov Chain'''<br />
The probability matrix <math>P</math> is the same for all indicies <math>n\in T</math>.
<math>Pr(X_n=j|X_{n-1}=i)= Pr(X_1=j|X_0=i)</math>


Can be used to determine probability of a state after a certain number of transition steps.
If we denote the pmf of X_n by a probability vector <math>\frac{}{}\mu_n = [P(X_n=x_1),P(X_n=x_2),..,P(X_n=x_i)]</math> <br />
In a Homogeneous Markov Chain the transition matrix does not change over time.
where <math>i</math> denotes an ordered index of all possible states of <math>X</math>.<br />
Then we have a definition for the<br />
'''marginal probabilty''' <math>\frac{}{}\mu_n(i) = P(X_n=i)</math><br />
where we simplify <math>X_n</math> to represent the ordered index of a state rather than the state itself.
<br /><br />
From this definition it can be shown that,
<math>\frac{}{}\mu_{n-1}P=\mu_0P^n</math>


In an irreducible matrix the matrix is invertible and has no eigenvalues of 0. and at least one eigenvalue of 1.
<big>'''Proof:'''</big>


For the estimation of the integral I(h(x)f(x))dx we wish for the markov chain to approach a distribution f(x), and compute 1/N SUM(h(x_i)where x_i's are generated.
<math>\mu_{n-1}P=[\sum_{\forall i}(\mu_{n-1}(i))P_{i1},\sum_{\forall i}(\mu_{n-1}(i))P_{i2},..,\sum_{\forall i}(\mu_{n-1}(i))P_{ij}]</math>
and since
<blockquote>
<math>\sum_{\forall i}(\mu_{n-1}(i))P_{ij}=\sum_{\forall i}P(X_n=x_i)Pr(X_n=j|X_{n-1}=i)=\sum_{\forall i}P(X_n=x_i)\frac{Pr(X_n=j,X_{n-1}=i)}{P(X_n=x_i)}</math>
<math>=\sum_{\forall i}Pr(X_n=j,X_{n-1}=i)=Pr(X_n=j)=\mu_{n}(j)</math>


example of pseudo random number sequence.
</blockquote>
Therefore,<br />
<math>\frac{}{}\mu_{n-1}P=[\mu_{n}(1),\mu_{n}(2),...,\mu_{n}(i)]=\mu_{n}</math>
 
With this, it is possible to define <math>P(n)</math> as an n-step transition matrix where <math>\frac{}{}P_{ij}(n) = Pr(X_n=j|X_0=i)</math><br />
Theorem: <math>\frac{}{}\mu_n=\mu_0P^n</math><br />
Proof:  <math>\frac{}{}\mu_n=\mu_{n-1}P</math> From the previous conclusion<br />
<math>\frac{}{}=\mu_{n-2}PP=...=\mu_0\prod_{i = 1}^nP</math> And since this is a homogeneous Markov chain, <math>P</math> does not depend on <math>i</math> so<br />
<math>\frac{}{}=\mu_0P^n</math>
 
From this it becomes easy to define the n-step transition matrix as <math>\frac{}{}P(n)=P^n</math>
 
===Summary===
*'''transition matrix''' is an NxN when <math>N=|X|</math> matrix with <math>P_{ij}=Pr(X_1=j|X_0=i)</math> where <math>i,j \in X</math><br />
*'''n-step transition matrix''' also NxN with <math>P_{ij}(n)=Pr(X_n=j|X_0=i)</math><br />
*'''marginal (probability of X)'''<math>\mu_n(i) = Pr(X_n=i)</math><br />
*'''Theorem:''' <math>P_n=P^n</math><br />
*'''Theorem:''' <math>\mu_n=\mu_0P^n</math><br />
---
===Definitions of different types of state sets===
Define <math>i,j \in</math> State Space<br />
If <math>P_{ij}(n) > 0</math> for some <math>n</math> , then we say <math>i</math> reaches <math>j</math> denoted by <math>i\longrightarrow j</math> <br />
This also mean j is accessible by i <math>j\longleftarrow i</math> <br />
If <math>i\longrightarrow j</math> and <math>j\longrightarrow i</math> then we say <math>i</math> and <math>j</math> communicate, denoted by <math>i\longleftrightarrow j</math>
<br /><br />
'''Theorems'''<br />
1) <math>i\longleftrightarrow i</math><br />
2) <math>i\longleftrightarrow j,j\longleftrightarrow k\Rightarrow i\longleftrightarrow k</math><br />
3) The set of states of <math>X</math> can be written as a unique disjoint union of subsets <math>X=X_1\bigcup X_2\bigcup...\bigcupX_k,k>0 </math> where two states <math>i</math> and <math>j</math> communicate <math>IFF</math> they belong to the same subset
<br /><br />
'''More Definitions'''<br />
A set is '''Irreducible''' if all states communicate with each other (<math>\Leftrightarrow k=1</math>)<br />
A subset of states is '''Closed''' if once you enter it, you can never leave.<br />
An '''Absorbing set''' is a closed set with only 1 element<br />

Revision as of 22:22, 10 June 2009

Practical application for estimation: The general concept for the application of this lies within having a set of generated [math]\displaystyle{ x_i }[/math] which approach a distribution [math]\displaystyle{ f(x) }[/math] so that a variation of importance estimation can be used to estimate an integral in the form
[math]\displaystyle{ I = \displaystyle\int^\ h(x)f(x)\,dx }[/math] by [math]\displaystyle{ \hat{I} = \frac{1}{N}\displaystyle\sum_{i=1}^Nh(x_i) }[/math]
All that is required is a Markov chain which eventually converges to [math]\displaystyle{ f(x) }[/math].

In the previous example, the entries [math]\displaystyle{ p_{ij} }[/math] in the transition matrix [math]\displaystyle{ P }[/math] represent the probability of reaching state [math]\displaystyle{ j }[/math] from state [math]\displaystyle{ i }[/math] after one step. For this reason, the sum over all entries j in a particular column sum to 1, as this itself must be a pmf if a transition from [math]\displaystyle{ i }[/math] is to lead to a state still within the state set for [math]\displaystyle{ X_t }[/math].

Homogeneous Markov Chain
The probability matrix [math]\displaystyle{ P }[/math] is the same for all indicies [math]\displaystyle{ n\in T }[/math]. [math]\displaystyle{ Pr(X_n=j|X_{n-1}=i)= Pr(X_1=j|X_0=i) }[/math]

If we denote the pmf of X_n by a probability vector [math]\displaystyle{ \frac{}{}\mu_n = [P(X_n=x_1),P(X_n=x_2),..,P(X_n=x_i)] }[/math]
where [math]\displaystyle{ i }[/math] denotes an ordered index of all possible states of [math]\displaystyle{ X }[/math].
Then we have a definition for the
marginal probabilty [math]\displaystyle{ \frac{}{}\mu_n(i) = P(X_n=i) }[/math]
where we simplify [math]\displaystyle{ X_n }[/math] to represent the ordered index of a state rather than the state itself.

From this definition it can be shown that, [math]\displaystyle{ \frac{}{}\mu_{n-1}P=\mu_0P^n }[/math]

Proof:

[math]\displaystyle{ \mu_{n-1}P=[\sum_{\forall i}(\mu_{n-1}(i))P_{i1},\sum_{\forall i}(\mu_{n-1}(i))P_{i2},..,\sum_{\forall i}(\mu_{n-1}(i))P_{ij}] }[/math] and since

[math]\displaystyle{ \sum_{\forall i}(\mu_{n-1}(i))P_{ij}=\sum_{\forall i}P(X_n=x_i)Pr(X_n=j|X_{n-1}=i)=\sum_{\forall i}P(X_n=x_i)\frac{Pr(X_n=j,X_{n-1}=i)}{P(X_n=x_i)} }[/math] [math]\displaystyle{ =\sum_{\forall i}Pr(X_n=j,X_{n-1}=i)=Pr(X_n=j)=\mu_{n}(j) }[/math]

Therefore,
[math]\displaystyle{ \frac{}{}\mu_{n-1}P=[\mu_{n}(1),\mu_{n}(2),...,\mu_{n}(i)]=\mu_{n} }[/math]

With this, it is possible to define [math]\displaystyle{ P(n) }[/math] as an n-step transition matrix where [math]\displaystyle{ \frac{}{}P_{ij}(n) = Pr(X_n=j|X_0=i) }[/math]
Theorem: [math]\displaystyle{ \frac{}{}\mu_n=\mu_0P^n }[/math]
Proof: [math]\displaystyle{ \frac{}{}\mu_n=\mu_{n-1}P }[/math] From the previous conclusion
[math]\displaystyle{ \frac{}{}=\mu_{n-2}PP=...=\mu_0\prod_{i = 1}^nP }[/math] And since this is a homogeneous Markov chain, [math]\displaystyle{ P }[/math] does not depend on [math]\displaystyle{ i }[/math] so
[math]\displaystyle{ \frac{}{}=\mu_0P^n }[/math]

From this it becomes easy to define the n-step transition matrix as [math]\displaystyle{ \frac{}{}P(n)=P^n }[/math]

Summary

  • transition matrix is an NxN when [math]\displaystyle{ N=|X| }[/math] matrix with [math]\displaystyle{ P_{ij}=Pr(X_1=j|X_0=i) }[/math] where [math]\displaystyle{ i,j \in X }[/math]
  • n-step transition matrix also NxN with [math]\displaystyle{ P_{ij}(n)=Pr(X_n=j|X_0=i) }[/math]
  • marginal (probability of X)[math]\displaystyle{ \mu_n(i) = Pr(X_n=i) }[/math]
  • Theorem: [math]\displaystyle{ P_n=P^n }[/math]
  • Theorem: [math]\displaystyle{ \mu_n=\mu_0P^n }[/math]

---

Definitions of different types of state sets

Define [math]\displaystyle{ i,j \in }[/math] State Space
If [math]\displaystyle{ P_{ij}(n) \gt 0 }[/math] for some [math]\displaystyle{ n }[/math] , then we say [math]\displaystyle{ i }[/math] reaches [math]\displaystyle{ j }[/math] denoted by [math]\displaystyle{ i\longrightarrow j }[/math]
This also mean j is accessible by i [math]\displaystyle{ j\longleftarrow i }[/math]
If [math]\displaystyle{ i\longrightarrow j }[/math] and [math]\displaystyle{ j\longrightarrow i }[/math] then we say [math]\displaystyle{ i }[/math] and [math]\displaystyle{ j }[/math] communicate, denoted by [math]\displaystyle{ i\longleftrightarrow j }[/math]

Theorems
1) [math]\displaystyle{ i\longleftrightarrow i }[/math]
2) [math]\displaystyle{ i\longleftrightarrow j,j\longleftrightarrow k\Rightarrow i\longleftrightarrow k }[/math]
3) The set of states of [math]\displaystyle{ X }[/math] can be written as a unique disjoint union of subsets [math]\displaystyle{ X=X_1\bigcup X_2\bigcup...\bigcupX_k,k\gt 0 }[/math] where two states [math]\displaystyle{ i }[/math] and [math]\displaystyle{ j }[/math] communicate [math]\displaystyle{ IFF }[/math] they belong to the same subset

More Definitions
A set is Irreducible if all states communicate with each other ([math]\displaystyle{ \Leftrightarrow k=1 }[/math])
A subset of states is Closed if once you enter it, you can never leave.
An Absorbing set is a closed set with only 1 element