Residual Component Analysis: Generalizing PCA for more flexible inference in linear-Gaussian models


Introduction

Probabilistic principal component analysis (PPCA) decomposes the covariance of a data vector [math]\displaystyle{ y }[/math] in [math]\displaystyle{ \mathbb{R}^p }[/math] into a low-rank term and a spherical noise term:

[math]\displaystyle{ y \sim \mathcal{N} (0, WW^T+\sigma^2 I ) }[/math]

Here [math]\displaystyle{ W \in \mathbb{R}^{p \times q} }[/math] with [math]\displaystyle{ q \lt p-1 }[/math], which imposes a reduced-rank structure on the covariance. The log-likelihood of the centered dataset [math]\displaystyle{ Y }[/math] in [math]\displaystyle{ \mathbb{R}^{n \times p} }[/math], with [math]\displaystyle{ n }[/math] data points and [math]\displaystyle{ p }[/math] features, is maximized by

[math]\displaystyle{ W_{ML} = U_qL_qR^T }[/math]

where the columns of [math]\displaystyle{ U_q }[/math] are the [math]\displaystyle{ q }[/math] principal eigenvectors of the sample covariance [math]\displaystyle{ \tilde S = n^{-1}Y^TY }[/math], and [math]\displaystyle{ L_q }[/math] is a diagonal matrix with elements [math]\displaystyle{ l_{i,i} = (\lambda_i - \sigma^2)^{1/2} }[/math], where [math]\displaystyle{ \lambda_i }[/math] is the [math]\displaystyle{ i }[/math]th largest eigenvalue of the sample covariance and [math]\displaystyle{ \sigma^2 }[/math] is the noise variance. This maximum-likelihood solution is invariant to rotation: [math]\displaystyle{ R }[/math] is an arbitrary [math]\displaystyle{ q \times q }[/math] rotation matrix. The matrix [math]\displaystyle{ W }[/math] spans the principal subspace of the data, and the model is known as probabilistic PCA.
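
The closed-form solution above can be illustrated with a minimal NumPy sketch. The function name `ppca_ml` and its arguments are illustrative and do not come from the paper. When `sigma2` is not supplied, the sketch assumes the mean of the discarded eigenvalues, which is the standard maximum-likelihood noise estimate from Tipping and Bishop's PPCA; that estimate is not stated in this section.

```python
import numpy as np

def ppca_ml(Y, q, sigma2=None, R=None):
    """Sketch of the PPCA maximum-likelihood loading matrix W = U_q L_q R^T.

    Y is assumed to be a centered (n x p) data matrix and q < p.
    """
    n, p = Y.shape
    S = Y.T @ Y / n                          # sample covariance, n^{-1} Y^T Y
    eigvals, eigvecs = np.linalg.eigh(S)     # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]        # reorder to descending
    lam, U = eigvals[order], eigvecs[:, order]
    if sigma2 is None:
        # Assumption: ML noise variance = mean of the p - q discarded eigenvalues.
        sigma2 = lam[q:].mean()
    # Diagonal entries (lambda_i - sigma^2)^{1/2}, clipped at zero for safety.
    L_q = np.diag(np.sqrt(np.maximum(lam[:q] - sigma2, 0.0)))
    if R is None:
        R = np.eye(q)                        # any q x q rotation gives the same W W^T
    W = U[:, :q] @ L_q @ R.T
    return W, sigma2

# Usage on synthetic, centered data.
rng = np.random.default_rng(0)
Y = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
Y -= Y.mean(axis=0)
W, sigma2 = ppca_ml(Y, q=2)
C = W @ W.T + sigma2 * np.eye(Y.shape[1])    # model covariance W W^T + sigma^2 I
```

Because [math]\displaystyle{ R }[/math] enters only through [math]\displaystyle{ WW^T }[/math], passing a different rotation to `ppca_ml` changes `W` but leaves the model covariance `C` unchanged.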