Markov Chain Definitions
Practical application for estimation:
The general idea of this application is to generate a set of samples <math>x_i</math> which approach a distribution <math>f(x)</math>, so that a variation of importance sampling can be used to estimate an integral of the form<br />
<math> I = \displaystyle\int h(x)f(x)\,dx </math> by <math>\hat{I} = \frac{1}{N}\displaystyle\sum_{i=1}^Nh(x_i)</math><br />
All that is required is a Markov chain which eventually converges to <math>f(x)</math>.
<br /><br />
In the previous example, the entries <math>p_{ij}</math> in the transition matrix <math>P</math> represent the probability of reaching state <math>j</math> from state <math>i</math> after one step. For this reason, the entries in a particular row <math>i</math> must sum to 1 over all <math>j</math>, as each row must itself be a pmf if a transition from <math>i</math> is to lead to a state still within the state set of <math>X_t</math>.
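This row-stochastic property is easy to check numerically. The matrix below is a hypothetical 3-state example (not one from the text), just to illustrate the check:

```python
import numpy as np

# Hypothetical 3-state transition matrix: P[i, j] = Pr(X_1 = j | X_0 = i).
P = np.array([
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.4, 0.4, 0.2],
])

# Each row is a pmf over the next state, so every row must sum to 1.
print(P.sum(axis=1))  # [1. 1. 1.]
```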
'''Homogeneous Markov Chain'''<br />
The transition matrix <math>P</math> is the same for all indices <math>n\in T</math>:
<math>Pr(X_n=j|X_{n-1}=i)= Pr(X_1=j|X_0=i)</math>
If we denote the pmf of <math>X_n</math> by a probability vector <math>\mu_n = [P(X_n=x_1),P(X_n=x_2),\dots,P(X_n=x_N)]</math> <br />
where the states of <math>X</math> are given a fixed ordered index <math>1,\dots,N</math>,<br />
then we have a definition for the<br />
'''marginal probability''' <math>\mu_n(i) = P(X_n=i)</math><br />
where we simplify <math>X_n</math> to represent the ordered index of a state rather than the state itself.
<br /><br />
From this definition it can be shown that
<math>\mu_{n-1}P=\mu_{n}</math>
<big>'''Proof:'''</big>
<math>\mu_{n-1}P=[\sum_{\forall i}\mu_{n-1}(i)P_{i1},\sum_{\forall i}\mu_{n-1}(i)P_{i2},\dots,\sum_{\forall i}\mu_{n-1}(i)P_{iN}]</math>
and since
<blockquote>
<math>\sum_{\forall i}\mu_{n-1}(i)P_{ij}=\sum_{\forall i}Pr(X_{n-1}=i)Pr(X_n=j|X_{n-1}=i)=\sum_{\forall i}Pr(X_{n-1}=i)\frac{Pr(X_n=j,X_{n-1}=i)}{Pr(X_{n-1}=i)}</math>
<math>=\sum_{\forall i}Pr(X_n=j,X_{n-1}=i)=Pr(X_n=j)=\mu_{n}(j)</math>
</blockquote>
Therefore,<br />
<math>\mu_{n-1}P=[\mu_{n}(1),\mu_{n}(2),\dots,\mu_{n}(N)]=\mu_{n}</math>
With this, it is possible to define <math>P(n)</math> as an n-step transition matrix where <math>P_{ij}(n) = Pr(X_n=j|X_0=i)</math><br />
Theorem: <math>\mu_n=\mu_0P^n</math><br />
Proof: <math>\mu_n=\mu_{n-1}P</math> from the previous conclusion,<br />
<math>=\mu_{n-2}PP=\dots=\mu_0\prod_{i = 1}^nP</math> and since this is a homogeneous Markov chain, <math>P</math> does not depend on <math>i</math>, so<br />
<math>=\mu_0P^n</math>
From this it becomes easy to show that the n-step transition matrix is <math>P(n)=P^n</math>
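The theorem <math>\mu_n=\mu_0P^n</math> can be verified numerically by comparing repeated one-step updates <math>\mu_n=\mu_{n-1}P</math> against the matrix power. The chain below is a made-up 3-state example:

```python
import numpy as np

# Hypothetical 3-state homogeneous chain (each row sums to 1).
P = np.array([
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.4, 0.4, 0.2],
])
mu0 = np.array([1.0, 0.0, 0.0])  # start in state 0 with certainty

# Repeated one-step updates: mu_n = mu_{n-1} P.
mu = mu0.copy()
n = 5
for _ in range(n):
    mu = mu @ P

# Closed form from the theorem: mu_n = mu_0 P^n.
mu_direct = mu0 @ np.linalg.matrix_power(P, n)
print(np.allclose(mu, mu_direct))  # True
```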
===Summary===
*'''transition matrix''': an NxN matrix, where <math>N=|X|</math>, with <math>P_{ij}=Pr(X_1=j|X_0=i)</math> where <math>i,j \in X</math><br />
*'''n-step transition matrix''': also NxN, with <math>P_{ij}(n)=Pr(X_n=j|X_0=i)</math><br />
*'''marginal (probability of X)''': <math>\mu_n(i) = Pr(X_n=i)</math><br />
*'''Theorem:''' <math>P(n)=P^n</math><br />
*'''Theorem:''' <math>\mu_n=\mu_0P^n</math><br />
----
===Definitions of different types of state sets===
Define <math>i,j \in</math> State Space<br />
If <math>P_{ij}(n) > 0</math> for some <math>n</math>, then we say <math>i</math> reaches <math>j</math>, denoted by <math>i\longrightarrow j</math> <br />
This also means <math>j</math> is accessible from <math>i</math>: <math>j\longleftarrow i</math> <br />
If <math>i\longrightarrow j</math> and <math>j\longrightarrow i</math> then we say <math>i</math> and <math>j</math> communicate, denoted by <math>i\longleftrightarrow j</math>
<br /><br />
'''Theorems'''<br />
1) <math>i\longleftrightarrow i</math><br />
2) <math>i\longleftrightarrow j,j\longleftrightarrow k\Rightarrow i\longleftrightarrow k</math><br />
3) The set of states of <math>X</math> can be written as a unique disjoint union of subsets <math>X=X_1\cup X_2\cup\dots\cup X_k,\ k>0</math>, where two states <math>i</math> and <math>j</math> communicate if and only if they belong to the same subset
<br /><br />
'''More Definitions'''<br />
A set is '''irreducible''' if all states communicate with each other (<math>\Leftrightarrow k=1</math>)<br />
A subset of states is '''closed''' if, once you enter it, you can never leave.<br />
An '''absorbing set''' is a closed set with only one element.
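The reachability relation and communicating classes can be computed directly from <math>P</math> via a transitive closure of the positive entries. The 4-state matrix below is an invented example in which the subset <math>\{2,3\}</math> is closed:

```python
import numpy as np

# Hypothetical 4-state chain: once in {2, 3} the chain never leaves,
# so {2, 3} is a closed set (and {0, 1} is not).
P = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [0.2, 0.3, 0.5, 0.0],
    [0.0, 0.0, 0.4, 0.6],
    [0.0, 0.0, 0.7, 0.3],
])
N = len(P)

# reach[i, j] is True when i --> j, i.e. P_ij(n) > 0 for some n >= 0.
reach = (P > 0) | np.eye(N, dtype=bool)
for k in range(N):  # Floyd-Warshall style transitive closure
    reach |= np.outer(reach[:, k], reach[k, :])

# i and j communicate when i --> j and j --> i; group states into classes.
communicate = reach & reach.T
classes = {frozenset(np.flatnonzero(communicate[i])) for i in range(N)}

for c in sorted(classes, key=min):
    inside = sorted(c)
    outside = [j for j in range(N) if j not in c]
    closed = not outside or P[np.ix_(inside, outside)].sum() == 0
    print(inside, "closed" if closed else "not closed")
```

With this example chain the two communicating classes are <math>\{0,1\}</math> and <math>\{2,3\}</math>, and only the latter is closed.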
Revision as of 21:22, 10 June 2009