Nonparametric Latent Feature Models for Link Prediction
Introduction
The goal of this paper <ref>Kurt T. Miller, Thomas L. Griffiths, and Michael I. Jordan. Nonparametric latent feature models for link prediction. In Advances in Neural Information Processing Systems, 2009.</ref> is link prediction for a partially observed network: we observe the presence or absence (1 or 0) of links between some pairs of nodes in a network and try to predict the unobserved links. The model is built by extracting latent structure that represents the properties of individual entities. Unlike the latent space model <ref>Peter D. Hoff, Adrian E. Raftery, and Mark S. Handcock. Latent space approaches to social network analysis. JASA, 97(460):1090-1098, 2002.</ref>, which tries to find the location (arbitrary real-valued features) of each node in a latent space, the "latent features" here refer to a class-based (binary feature) representation. Entities are assumed to belong to a finite number of classes, and the interactions between classes determine the structure of the network. More specifically, the probability of forming a link depends only on the classes of the corresponding pair of nodes. The idea is similar to the stochastic blockmodel <ref>Krzysztof Nowicki and Tom A. B. Snijders. Estimation and prediction for stochastic blockstructures. JASA, 96(455):1077-1087, 2001.</ref> <ref>Edoardo M. Airoldi, David M. Blei, Eric P. Xing, and Stephen E. Fienberg. Mixed membership stochastic blockmodels. In Advances in Neural Information Processing Systems, 2009.</ref>. However, blockmodels are mainly used for community detection/network clustering, not link prediction. This paper fills that gap.
The nonparametric latent feature relational model is a Bayesian nonparametric model in which each node has binary-valued latent features that influence its relations to other nodes. Known covariate information can also be incorporated. The model simultaneously infers the number (dimension) of latent features, the values of the features for each node, and how the features influence the links.
The nonparametric latent feature relational model
A directed network is considered here. Let [math]\displaystyle{ Y }[/math] be the [math]\displaystyle{ N\times N }[/math] binary adjacency matrix of the network. The component [math]\displaystyle{ y_{ij}=1 }[/math] if there is a link from node [math]\displaystyle{ i }[/math] to node [math]\displaystyle{ j }[/math], and [math]\displaystyle{ y_{ij}=0 }[/math] if there is no link. The components corresponding to unobserved links are left unfilled. The goal is to learn from the observed links so that the unfilled entries can be predicted.
Basic model
Let [math]\displaystyle{ Z }[/math] denote the latent features, where [math]\displaystyle{ Z }[/math] is an [math]\displaystyle{ N\times K }[/math] binary matrix. Each row of [math]\displaystyle{ Z }[/math] corresponds to a node and each column corresponds to a latent feature, so that [math]\displaystyle{ z_{ik}=1 }[/math] if the [math]\displaystyle{ i^{th} }[/math] node has feature [math]\displaystyle{ k }[/math] and [math]\displaystyle{ z_{ik}=0 }[/math] otherwise. Let [math]\displaystyle{ Z_i }[/math] denote the [math]\displaystyle{ i^{th} }[/math] row of [math]\displaystyle{ Z }[/math] (the feature vector of node [math]\displaystyle{ i }[/math]). Let [math]\displaystyle{ W }[/math] be a [math]\displaystyle{ K\times K }[/math] real-valued weight matrix, where [math]\displaystyle{ w_{kk^\prime} }[/math] is the weight that affects the probability of a link when the corresponding nodes have features [math]\displaystyle{ k }[/math] and [math]\displaystyle{ k^\prime }[/math], respectively. Assuming the link probabilities are conditionally independent given the latent features and the weights, the likelihood function can be written as:
[math]\displaystyle{ Pr(Y|Z, W)=\prod_{i,j}Pr(y_{ij}|Z_i,Z_j, W), \qquad Pr(y_{ij}=1|Z_i,Z_j,W)=\sigma(Z_i W Z_j^{\top})=\sigma\Big(\sum_{k,k^\prime}z_{ik}z_{jk^\prime}w_{kk^\prime}\Big) }[/math]
where [math]\displaystyle{ \sigma(\cdot) }[/math] is a function that maps values from [math]\displaystyle{ (-\infty,\infty) }[/math] to [math]\displaystyle{ (0,1) }[/math], for example the logistic function [math]\displaystyle{ \sigma(x)=1/(1+e^{-x}) }[/math].
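To make the likelihood concrete, here is a minimal numerical sketch of the basic model's link probabilities. The function names and the toy values of Z and W are illustrative assumptions, not code from the paper, and the logistic function stands in for the transformation [math]\displaystyle{ \sigma }[/math]:

<pre>
import numpy as np

def sigmoid(x):
    """Logistic function mapping (-inf, inf) to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def link_probabilities(Z, W):
    """Link probability for every ordered pair (i, j).

    Z : (N, K) binary latent feature matrix
    W : (K, K) real-valued weight matrix
    Returns an (N, N) matrix whose (i, j) entry is sigma(Z_i W Z_j^T).
    """
    return sigmoid(Z @ W @ Z.T)

# Toy example: 3 nodes, 2 latent features
Z = np.array([[1, 0],
              [1, 1],
              [0, 1]])
W = np.array([[ 0.8, -1.2],
              [ 0.3,  2.0]])
print(link_probabilities(Z, W))
</pre>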
Prior distributions are placed on the latent features and the weight matrix: [math]\displaystyle{ Z }[/math] is generated by the Indian Buffet Process <ref>Thomas L. Griffiths and Zoubin Ghahramani. Infinite latent feature models and the Indian Buffet Process. In Advances in Neural Information Processing Systems, 2007.</ref> and each component of [math]\displaystyle{ W }[/math] has a normal prior, i.e.
[math]\displaystyle{ Z \sim IBP(\alpha) }[/math]
[math]\displaystyle{ w_{kk^\prime}\sim \mathcal{N}(0, \sigma_w^2) }[/math]
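For intuition about the prior on [math]\displaystyle{ Z }[/math], below is a small sketch of the standard IBP generative process (in the culinary metaphor, customer [math]\displaystyle{ i }[/math] takes each existing dish [math]\displaystyle{ k }[/math] with probability [math]\displaystyle{ m_k/i }[/math] and then tries Poisson([math]\displaystyle{ \alpha/i }[/math]) new dishes). The function name and its arguments are hypothetical:

<pre>
import numpy as np

def sample_ibp(N, alpha, rng=None):
    """Draw a binary feature matrix Z ~ IBP(alpha) for N customers."""
    rng = np.random.default_rng(rng)
    columns = []  # one length-N indicator array per feature ("dish")
    for i in range(1, N + 1):
        # take each existing dish with probability m_k / i
        for col in columns:
            m_k = col.sum()  # customers who already took this dish
            col[i - 1] = rng.random() < m_k / i
        # try Poisson(alpha / i) brand-new dishes
        for _ in range(rng.poisson(alpha / i)):
            new_col = np.zeros(N, dtype=int)
            new_col[i - 1] = 1
            columns.append(new_col)
    return np.column_stack(columns) if columns else np.zeros((N, 0), dtype=int)

Z = sample_ibp(N=10, alpha=2.0, rng=0)
print(Z.shape)  # (10, K) with K inferred from the draw
</pre>

Note that the number of columns [math]\displaystyle{ K }[/math] is itself random, which is what lets the model infer the number of latent features from data.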
Full model
Covariate information about each node can be incorporated into the model. The full nonparametric latent feature relational model is
[math]\displaystyle{ Pr(y_{ij}=1|Z, W, X, \beta, a, b, c)=\sigma(Z_i W Z_j^\top +\beta^\top X_{ij}+(\beta_p^\top X_{p,i}+a_i)+(\beta_c^\top X_{c,j}+b_j)+c) }[/math]
where [math]\displaystyle{ X_{p,i} }[/math] and [math]\displaystyle{ X_{c,j} }[/math] are known covariate vectors for node [math]\displaystyle{ i }[/math] as link parent and node [math]\displaystyle{ j }[/math] as link child, respectively; [math]\displaystyle{ X_{ij} }[/math] is a vector of interaction effects; and [math]\displaystyle{ \beta, \beta_p, \beta_c, a, b }[/math] and [math]\displaystyle{ c }[/math] are coefficients and offsets, all assumed to be normally distributed. The corresponding terms are dropped when no covariate information is available.
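A sketch of the full predictor for a single ordered pair [math]\displaystyle{ (i, j) }[/math]; the array shapes and parameter names here are assumptions made for illustration:

<pre>
import numpy as np

def full_model_prob(i, j, Z, W, X_int, X_p, X_c,
                    beta, beta_p, beta_c, a, b, c):
    """sigma(Z_i W Z_j^T + beta^T X_ij + (beta_p^T X_{p,i} + a_i)
             + (beta_c^T X_{c,j} + b_j) + c) for one ordered pair (i, j).

    X_int : (N, N, D) interaction covariates, X_int[i, j] = X_ij
    X_p   : (N, Dp) parent covariates;  X_c : (N, Dc) child covariates
    a, b  : (N,) per-node offsets;      c   : global offset
    """
    eta = (Z[i] @ W @ Z[j]
           + beta @ X_int[i, j]
           + beta_p @ X_p[i] + a[i]
           + beta_c @ X_c[j] + b[j]
           + c)
    return 1.0 / (1.0 + np.exp(-eta))
</pre>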
Generalizations
The model can easily be generalized to multiple relations instead of a single relation. The latent features remain the same, but an independent weight matrix [math]\displaystyle{ W^i }[/math] is used for each relation [math]\displaystyle{ Y^i }[/math]. Covariates may be relation-specific or common across all relations. By restricting the weight matrix to be symmetric, the model can also handle undirected networks, as sketched below.
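A small sketch of the multi-relation extension, assuming the same logistic link as above; the function name and W_list are illustrative:

<pre>
import numpy as np

def multi_relation_probs(Z, W_list):
    """Link probabilities for several relations sharing one feature matrix Z.

    W_list holds one (K, K) weight matrix W^i per relation Y^i; for an
    undirected network, pass a symmetric W (W == W.T).
    """
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    return [sigmoid(Z @ W @ Z.T) for W in W_list]
</pre>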
Inference
Simulations and real data results
Conclusion
References
<references/>