f11Stat946ass

From statwiki
Jump to navigation Jump to search

problem 1

Recall the definition of the family of probability distributions associated with a directed graphical model:

[math]\displaystyle{ p(x) = \prod_{i=1}^{n}f_i(x_i,x_{\pi_i}). }[/math]

Prove by induction that [math]\displaystyle{ p(x_i|x_{\pi_i})=f_i(x_i,x_{\pi_i}) }[/math].

solution 1

According to the product rule we have:

[math]\displaystyle{ p(x_1,\ldots,x_{k+1})=p(x_1,\ldots,x_{k})p(x_{k+1}|x_1,\ldots,x_{k}).~\sharp }[/math]

This is the most general case for a directed graph, as we can represent each and every graphical model with a fully connected graph.

Now, we need to show that this statement, [math]\displaystyle{ p(x_i|x_{\pi_i})=f_i(x_i,x_{\pi_i}) }[/math], holds for [math]\displaystyle{ n=1 }[/math] (and [math]\displaystyle{ n=2 }[/math], just for the sake of more clarity).

[math]\displaystyle{ \begin{align} n=&1:~p(x)=p(x_1)=p(x_1|\emptyset)=f_1(x_1,x_{\pi_1}), \mbox{ where } x_{\pi_1}=\emptyset \rightarrow p(x_i|x_{\pi_i})=f_i(x_i,x_{\pi_i}),\\ n=&2:~p(x)=p(x_1,x_2)=p(x_1)p(x_2|x_1)=p(x_1|\emptyset)p(x_2|x_1)=f_1(x_1,x_{\pi_2})f_2(x_2,x_{\pi_2})\rightarrow p(x_i|x_{\pi_i})=f_i(x_i,x_{\pi_i}). \end{align} }[/math]

Where in the latter case, without loss of generality, we have assumed that [math]\displaystyle{ x_1 }[/math] is the only parent of [math]\displaystyle{ x_2 }[/math]. This assumption that we've taken, not only does not reduce from the generality of the proof (as we have initially assumed that the graph is directed and in a directed graph one of two connected nodes/random variables should be a parent of the other), but it also abides by our earlier assumption, that parents come before children. This leaves no parents for [math]\displaystyle{ x_1 }[/math], similar to the former case in the above.

Now, assuming that the statement [math]\displaystyle{ p(x_i|x_{\pi_i})=f_i(x_i,x_{\pi_i}) }[/math] holds for [math]\displaystyle{ n=k }[/math],

[math]\displaystyle{ \begin{align} n=k: p(x)=p(x_1,x_2,\ldots,x_k)=\prod_{i=1}^{k}f_i(x_i,x_{\pi_i})=\prod_{i=1}^{k}p(x_i|x_{\pi_i}).~\flat \end{align} }[/math]

we need to show that it also holds for [math]\displaystyle{ n=k+1 }[/math].

[math]\displaystyle{ \begin{align} n=k+1: p(x)&=p(x_1,x_2,\ldots,x_{k+1})=\prod_{i=1}^{k+1}f_i(x_i,x_{\pi_i}) \\ &=f_{k+1}(x_{k+1},x_{\pi_{k+1}})\prod_{i=1}^{k}f_i(x_i,x_{\pi_i}) \\ &\stackrel{\flat}{=}f_{k+1}(x_{k+1},x_{\pi_{k+1}})\prod_{i=1}^{k}p(x_i|x_{\pi_i}). \end{align} }[/math]

On the other hand according to [math]\displaystyle{ \sharp }[/math] we have:

[math]\displaystyle{ \begin{align} p(x_1,\ldots,x_{k+1})=p(x_1,\ldots,x_{k})p(x_{k+1}|x_1,\ldots,x_{k}) \end{align} }[/math]

Thus,

[math]\displaystyle{ \begin{align} f_{k+1}(x_{k+1},x_{\pi_{k+1}})\prod_{i=1}^{k}p(x_i|x_{\pi_i})=p(x_{k+1}|x_1,\ldots,x_{k})p(x_1,\ldots,x_{k}). \end{align} }[/math]

Once more using [math]\displaystyle{ \flat }[/math], we have:

[math]\displaystyle{ \begin{align} f_{k+1}(x_{k+1},x_{\pi_{k+1}})=p(x_{k+1}|x_1,\ldots,x_{k}), \end{align} }[/math]

and so:

[math]\displaystyle{ \begin{align} p(x_i|x_{\pi_i})=f_i(x_i,x_{\pi_i})~\triangle. \end{align} }[/math]

problem 2

For a given variable [math]\displaystyle{ X_i }[/math] in a graphical model, what is the minimal set of nodes that renders [math]\displaystyle{ X_i }[/math] conditionally independent of all of the other variables? That is, what is the smallest set [math]\displaystyle{ C }[/math], such that [math]\displaystyle{ X_i\perp X_{V\setminus \{i\}\cup C}|X_C }[/math]? (Note that [math]\displaystyle{ V }[/math] is the set of all nodes in the graph, so that [math]\displaystyle{ V\setminus \{i\}\cup C }[/math] is the set of all nodes in the graph, excluding [math]\displaystyle{ C }[/math] and [math]\displaystyle{ i }[/math]).

  • Do the problem for an undirected graph.
  • Do the problem for a directed graph.

solution 1

a. As in directed graphs, the relationship between each two nodes, [math]\displaystyle{ X_i }[/math] and [math]\displaystyle{ X_j }[/math], can be of one of the child-parent or parent-child forms. Let's assume that we show the set of parents of the node [math]\displaystyle{ X_i }[/math] like this: [math]\displaystyle{ X_{\pi_i} }[/math]. So, a general node can be thought of as in Fig. 1. In this figure, all the possible relationships for the node [math]\displaystyle{ X_4 }[/math] and its immediate neighbours and those of the depth one has been shown.

File:ass1-2a.png
Fig.1 A directed graph!

From the graph in the figure we have:

[math]\displaystyle{ \begin{align} p(x) = p(X_1)p(X_2|X_1)p(X_3|X_2)p(X_4|X_2)p(X_6|X_4,X_5)p(X_7|X_6)\\ \end{align} }[/math]

So, we can derive:

[math]\displaystyle{ \begin{align} p(X_1,X_2,X_4)&=p(X_1)p(X_2|X_1)p(X_4|X_2)\sum_{X_3}p(X_3|X_2)\sum_{X_5}\sum_{X_6}p(X_6|X_4,X_5)\sum_{X_7}p(X_7|X_6)\\ &=p(X_1)p(X_2|X_1)p(X_4|X_2)\\ \rightarrow p(X_4|X_1,X_2) &= \frac{p(X_1,X_2,X_4)}{p(X_1,X_2)}=\frac{p(X_1)p(X_1,X_2)p(X_4,X_2)}{p(X_1,X_2)p(X_1)p(X_2)}=p(X_4|X_2)~\rightarrow~X_4 \perp X_1 | X_2. \end{align} }[/math]

Also we have:

[math]\displaystyle{ \begin{align} p(X_2,X_3,X_4)&=p(X_4|X_2)p(X_3|X_2)\sum_{X_1}p(X_1)p(X_2|X_1)\sum_{X_5}\sum_{X_6}p(X_6|X_4,X_5)\sum_{X_7}p(X_7|X_6)\\ &=p(X_4|X_2)p(X_3|X_2)p(X_2)\\ \rightarrow p(X_4,X_3|X_2) &= \frac{p(X_2,X_3,X_4)}{p(X_2)}=\frac{p(X_4|X_2)p(X_3|X_2)p(X_2)}{p(X_2)}=p(X_4|X_2)p(X_3|X_2)~\rightarrow~X_4 \perp X_3 | X_2. \end{align} }[/math]

So, by induction we can say that given the parents of a node, it will be conditionally independent of all of the nodes that come before. It is assumed that parents always come before children.

Now, if we check the independence of the nodes, from those there is a path to [math]\displaystyle{ X_4 }[/math] -in general- but not through parents of [math]\displaystyle{ X_4 }[/math], we will have:

[math]\displaystyle{ \begin{align} X_4 \perp X_7 | X_6,\\ X_4 \cancel{\perp} X_5 | X_6. \end{align} }[/math]

And if the children of [math]\displaystyle{ X_4 }[/math] are not given, we will have:

[math]\displaystyle{ \begin{align} X_4 \cancel{\perp} X_7,\\ X_4 \perp X_5. \end{align} }[/math]

So, as of the set [math]\displaystyle{ C }[/math], we can state:

[math]\displaystyle{ C_{X_i}=\pi_{X_i}\cup\{k,\pi_{X_k}|i\in \pi_{X_k}\} }[/math]


Problem 6

For the two graphs in Figure 1, determine which variables can be reached using the Bayes ball algorithm if we start at node A.

Fig.1 Bayes ball examples.


Solution 1

First graph: Starting from node A, the ball passes to nodes B & C. Node B blocks the ball, while node C passes the ball to nodes E & F. In turn, node E passes the ball to node G which passes the ball to nodes I & D. Node F passes the ball to node H which passes the ball to the already visited node, I. Therefore all nodes are visited. More formally, we can say that the variables reached using the Bayes ball algorithm are: A

B C E F G H D I


Second graph: Again, if we start from node A, it passes the ball to node G. As G receives the ball from a parent, it can only pass the ball to its children, hence G passes the ball to J. J is shaded, therefore it bounces the ball back to its parent G. This time G receives the ball from a child and can pass it everywhere and then the ball is passed to node D which passes the ball to nodes B & H. Node H doesn't have children and can't pass the ball anywhere, while node B passes the ball to node E which passes the ball to node I. Node I doesn't have the children and this is where the Bayes ball stops. We can now list the variables reached using the Bayes ball algorithm in the second graph as: A G J D B H E I