f11Stat946ass
problem 1
Recall the definition of the family of probability distributions associated with a directed graphical model:
Prove by induction that [math]\displaystyle{ p(x_i|x_{\pi_i})=f_i(x_i,x_{\pi_i}) }[/math].
solution 1
According to the product rule we have:
This is the most general case for a directed graph, as we can represent each and every graphical model with a fully connected graph.
Now, we need to show that this statement, [math]\displaystyle{ p(x_i|x_{\pi_i})=f_i(x_i,x_{\pi_i}) }[/math], holds for [math]\displaystyle{ n=1 }[/math] (and [math]\displaystyle{ n=2 }[/math], just for the sake of more clarity).
Where in the latter case, without loss of generality, we have assumed that [math]\displaystyle{ x_1 }[/math] is the only parent of [math]\displaystyle{ x_2 }[/math]. This assumption that we've taken, not only does not reduce from the generality of the proof (as we have initially assumed that the graph is directed and in a directed graph one of two connected nodes/random variables should be a parent of the other), but it also abides by our earlier assumption, that parents come before children. This leaves no parents for [math]\displaystyle{ x_1 }[/math], similar to the former case in the above.
Now, assuming that the statement [math]\displaystyle{ p(x_i|x_{\pi_i})=f_i(x_i,x_{\pi_i}) }[/math] holds for [math]\displaystyle{ n=k }[/math],
we need to show that it also holds for [math]\displaystyle{ n=k+1 }[/math].
On the other hand according to [math]\displaystyle{ \sharp }[/math] we have:
[math]\displaystyle{ \begin{align} p(x_1,\ldots,x_{k+1})=p(x_1,\ldots,x_{k})p(x_{k+1}|x_1,\ldots,x_{k}) \end{align} }[/math]
Thus,
Once more using [math]\displaystyle{ \flat }[/math], we have:
and so:
problem 2
For a given variable [math]\displaystyle{ X_i }[/math] in a graphical model, what is the minimal set of nodes that renders [math]\displaystyle{ X_i }[/math] conditionally independent of all of the other variables? That is, what is the smallest set [math]\displaystyle{ C }[/math], such that [math]\displaystyle{ X_i\perp X_{V\setminus \{i\}\cup C}|X_C }[/math]? (Note that [math]\displaystyle{ V }[/math] is the set of all nodes in the graph, so that [math]\displaystyle{ V\setminus \{i\}\cup C }[/math] is the set of all nodes in the graph, excluding [math]\displaystyle{ C }[/math] and [math]\displaystyle{ i }[/math]).
- Do the problem for an undirected graph.
- Do the problem for a directed graph.
solution 1
a. As in directed graphs, the relationship between each two nodes, [math]\displaystyle{ X_i }[/math] and [math]\displaystyle{ X_j }[/math], can be of one of the child-parent or parent-child forms. Let's assume that we show the set of parents of the node [math]\displaystyle{ X_i }[/math] like this: [math]\displaystyle{ X_{\pi_i} }[/math]. So, a general node can be thought of as in Fig. 1. In this figure, all the possible relationships for the node [math]\displaystyle{ X_4 }[/math] and its immediate neighbours and those of the depth one has been shown.
From the graph in the figure we have:
So, we can derive:
Also we have:
So, by induction we can say that given the parents of a node, it will be conditionally independent of all of the nodes that come before. It is assumed that parents always come before children.
Now, if we check the independence of the nodes, from those there is a path to [math]\displaystyle{ X_4 }[/math] -in general- but not through parents of [math]\displaystyle{ X_4 }[/math], we will have:
And if the children of [math]\displaystyle{ X_4 }[/math] are not given, we will have:
So, as of the set [math]\displaystyle{ C }[/math], we can state:
b. As of an undirected graph, we will have:
problem 3
Consider the directed graphical model below.
(a) Moralize the graph, and show the resulting undirected graphical model.
(b) Invoke UndirectedGraphEliminate on your moral graph, using the elimination ordering (7, 8, 9, 6, 3, 5, 4, 2, 1), and show the resulting reconstituted graph (the graph that includes all the edges added during the elimination process). You do not have to show the intermediate steps of the algorithm.
(c) Report (b), but using the elimination ordering (7, 6, 8, 9, 4, 3, 2, 5, 1).
(d) Which of the two elimination orderings, if used in Eliminate to calculate [math]\displaystyle{ p(x_1|x_7) }[/math], do you think would result in a more time-efficient calculation? Which would result in a more space-efficient calculation? Briefly explain why.
solution 1
(a) Here is the resulting moralized undirected graph (Fig. 3).
(b) Here is the resulting reconstituted graph, after a (7, 8, 9, 6, 3, 5, 4, 2, 1) order elimination. (Fig. 4)
(c) Here is the resulting reconstituted graph, after a (7, 6, 8, 9, 4, 3, 2, 5, 1) order elimination. (Fig. 5)
(d) It is more computationally efficient to remove those nodes, which have less number of edges, as it takes less number of summations over their neighbours. So, starting with single-edge nodes would be a good idea, which makes the first ordering more computationally efficient. (What does space-efficient means, though?)
Problem 6
For the two graphs in Figure 1, determine which variables can be reached using the Bayes ball algorithm if we start at node A.
Solution 1
First graph: Starting from node A, the ball passes to nodes B & C. Node B blocks the ball, while node C passes the ball to nodes E & F. In turn, node E passes the ball to node G which passes the ball to nodes I & D. Node F passes the ball to node H which passes the ball to the already visited node, I. Therefore all nodes are visited.
More formally, we can say that the variables reached using the Bayes ball algorithm are:
A
B
C
E
F
G
H
D
I
Second graph:
Again, if we start from node A, it passes the ball to node G. As G receives the ball from a parent, it can only pass the ball to its children, hence G passes the ball to J. J is shaded, therefore it bounces the ball back to its parent G. This time G receives the ball from a child and can pass it everywhere and then the ball is passed to node D which passes the ball to nodes B & H. Node H doesn't have children and can't pass the ball anywhere, while node B passes the ball to node E which passes the ball to node I. Node I doesn't have the children and this is where the Bayes ball stops.
We can now list the variables reached using the Bayes ball algorithm in the second graph as:
A
G
J
D
B
H
E
I