graphical models for structured classification, with an application to interpreting images of protein subcellular location patterns

From statwiki
Revision as of 01:17, 11 November 2011 by Tameem (talk | contribs) (Background)
Jump to: navigation, search


In structured classification problems, there is a direct conflict between expressive models and efficient inference: while graphical models such as factor graphs can represent arbitrary dependences among instance labels, the cost of inference via belief propagation in these models grows rapidly as the graph structure becomes more complicated. One important source of complexity in belief propagation is the need to marginalize large factors to compute messages. This operation takes time exponential in the number of variables in the factor, and can limit the expressiveness of the models used. A new class of potential functions is proposed, which is called decomposable k-way potentials. It provides efficient algorithms for computing messages from these potentials during belief propagation. These new potentials provide a good balance between expressive power and efficient inference in practical structured classification problems. Three instances of decomposable potentials are discussed: the associative Markov network potential, the nested junction tree, and the voting potential. The new representation and algorithm lead to substantial improvements in both inference speed and classification accuracy.

Belief Propagation

Just to stick with the notion of the authors, let's assume that [math]\phi_i^{loc}(x_i)[/math] is the one-argument factor that represents the local evidence on [math]x_i[/math]. Moreover, Figure 1 shows the notion they use in graphs. The small squares denote potential functions, and, as usual, the shaded and unshaded circles represent observed and unobserved variables respectively.

File:sum fig1.JPG
Fig.1 A probability distribution represented as a factor graph

Using such notion, the message sent from a variable [math]x_i[/math] to a potential function [math]\phi_k[/math] as:

[math]m_{i \rightarrow k}(x_i)=\phi_i^{loc}(x_i)\prod_{j=1}^{k-1}m_{j \rightarrow i}(x_i) (1)[/math]

Similarly, a message from a potential function [math]\phi_j[/math] to [math]x_k[/math] can be computed as:

[math]m_{j \rightarrow k}(x_k)=\sum_{x_1}\sum_{x_2}...\sum_{x_{k-1}}\phi_j(x_1,...,x_k)\prod_{i=1}^{k-1}m_{i \rightarrow j}(x_i) (2)[/math]