nonparametric Latent Feature Models for Link Prediction

From statwiki
Jump to navigation Jump to search

Introduction

The goal of this paper <ref>Kurt T. Miller, Thomas L. Griffiths, and Michael I. Jordan. Nonparametric latent feature models for link prediction. NIPS, 2009</ref>is link prediction for a partially observed network, i.e. we observe the links (1 or 0) between some pairs of the nodes in a network and we try to predict the unobserved links. Basically, it builds the model by extracting the latent structure that representing the properties of individual entities. Unlike the latent space model <ref>Peter D. Hoff, Adrian E. Raftery, and Mark S. Handcock. Latent space approaches to social network analysis. JASA, 97(460):1090-1098.</ref>, which tries to find the location (arbitrary real valued features) of each node in a latent space, the "latent feature" here mainly refers to a class-based (binary features) representation. They assume a finite number of classes that entities can belong to and the interactions between classes determine the structure of the network. More specifically, the probability of forming a link depends only on the classes of the corresponding pair of nodes. The idea is fairly similar to the stochastic blockmodel <ref>Krzysztof Nowicki and Tom A. B. Snijders. Estimation and prediction for stochastic blockstructures. JASA, 96(455):1077-1087, 2001. </ref> <ref>Edoardo M. Airoldi, David M. Blei, Exic P. Xing, and Stephen E. Fienberg. Mixed membership stochastic block models. In Advances in Neural Information Processing Systems. 2009.</ref>. However, the blockmodels are mainly for community detection/network clustering, but not link prediction. This paper fills in the gap.

The nonparametric latent feature relational model is a Bayesian nonparametric model in which each node has binary-valued latent features that influences its relation to other nodes. Known covariates information can also be incorporated. The model can simultaneously infer the number (dimension) of latent features, the values of the features for each node and how the features influence the links.

The nonparametric latent feature relational model

Directed network is considered here. Let [math]\displaystyle{ Y }[/math] be the [math]\displaystyle{ N\times N }[/math] binary adjacency matrix of a network. The component [math]\displaystyle{ y_{ij}=1 }[/math] if there is a link from node [math]\displaystyle{ i }[/math] to node [math]\displaystyle{ j }[/math] and [math]\displaystyle{ y_{ij}=0 }[/math] if there is no link. The components corresponding to unobserved links are left unfilled. The goal is to learn from the observed links so that we can predict the unfilled entries.

Basic model

Let [math]\displaystyle{ Z }[/math] denote the latent features, where [math]\displaystyle{ Z }[/math] is a [math]\displaystyle{ N\times K }[/math] binary matrix. Each row of [math]\displaystyle{ Z }[/math] corresponds to a node and each column correspond to a latent feature such that [math]\displaystyle{ z_{ij}=1 }[/math] if the [math]\displaystyle{ i^{th} }[/math] node has feature [math]\displaystyle{ k }[/math] and 0, otherwise. And let [math]\displaystyle{ Z_i }[/math] denote the [math]\displaystyle{ i^{th} }[/math] row of [math]\displaystyle{ Z }[/math] (the feature vector corresponding to node i). Let W be a [math]\displaystyle{ K\times K }[/math] real-valued weight matrix where [math]\displaystyle{ w_{kk^\prime} }[/math] is the weight that affects the probability of a link when the corresponding nodes have features [math]\displaystyle{ k }[/math] and [math]\displaystyle{ k^' }[/math], respectively. By assuming the link probabilities are conditional independent give the latent features and the weights, the likelihood function can be written as:

[math]\displaystyle{ Pr(Y|Z, W)=\prod_{i,j}Pr(y_{ij}|Z_i,Z_j, W):=\sigma(Z_i W Z_j^{\top})=\sigma(\sum_{k,k^'}z_{ik}z_{jk^'}w_{kk^'}) }[/math]

where [math]\displaystyle{ \sigma(.) }[/math] is a function that transforms values from [math]\displaystyle{ (-\infty,\infty) }[/math] to [math]\displaystyle{ (0,1) }[/math].

Inference

Simulations and real data results

Conclusion

References

<references/>