Measuring Statistical Dependence with Hilbert-Schmidt Norm
Revision as of 10:45, 24 June 2009
An independence criterion based on covariance operators in reproducing kernel Hilbert spaces (RKHSs) is proposed. An empirical estimate of this measure is also given, which is referred to as the Hilbert-Schmidt Independence Criterion, or HSIC. This criterion can be used as a dependence measure in practical applications such as Independent Component Analysis (ICA), Maximum Variance Unfolding (MVU), feature extraction, and feature selection.
RKHS Theory
Let [math]\displaystyle{ \mathcal{F} }[/math] be a Hilbert space of functions from [math]\displaystyle{ \mathcal{X} }[/math] to [math]\displaystyle{ \mathbb{R} }[/math]. We assume [math]\displaystyle{ \mathcal{F} }[/math] is a Reproducing Kernel Hilbert Space, i.e., for all [math]\displaystyle{ x\in \mathcal{X} }[/math], the corresponding Dirac evaluation operator [math]\displaystyle{ \delta_x:\mathcal{F} \rightarrow \mathbb{R} }[/math], which maps [math]\displaystyle{ f }[/math] to [math]\displaystyle{ f(x) }[/math], is a bounded (or equivalently continuous) linear functional. We denote the reproducing kernel of [math]\displaystyle{ \mathcal{F} }[/math] by [math]\displaystyle{ k(x,x')=\langle \phi(x),\phi(x') \rangle_{\mathcal{F}} }[/math], where [math]\displaystyle{ k:\mathcal{X}\times\mathcal{X}\rightarrow \mathbb{R} }[/math] and [math]\displaystyle{ \phi }[/math] is the feature map of [math]\displaystyle{ \mathcal{F} }[/math]. Similarly, we consider another RKHS named [math]\displaystyle{ \mathcal{G} }[/math] with domain [math]\displaystyle{ \mathcal{Y} }[/math], kernel [math]\displaystyle{ l(\cdot,\cdot) }[/math] and feature map [math]\displaystyle{ \psi }[/math]. We assume both [math]\displaystyle{ \mathcal{F} }[/math] and [math]\displaystyle{ \mathcal{G} }[/math] are separable, i.e., they have a countable complete orthonormal basis.
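The identity between a kernel and the inner product of feature maps can be checked numerically in a case where the feature map is known in closed form. The sketch below is only an illustration and is not taken from the article: it assumes the homogeneous degree-2 polynomial kernel on the plane, whose feature map into three dimensions can be written explicitly. The function names (poly2_kernel, poly2_feature_map, gram_matrix) and the sample points are hypothetical choices made for this example.

```python
import math


def poly2_kernel(x, y):
    """Assumed example kernel: k(x, y) = (x . y)^2 for x, y in R^2."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2


def poly2_feature_map(x):
    """Explicit feature map phi: R^2 -> R^3 matching poly2_kernel."""
    return (x[0] ** 2, math.sqrt(2.0) * x[0] * x[1], x[1] ** 2)


def inner(u, v):
    """Euclidean inner product of two equal-length tuples."""
    return sum(a * b for a, b in zip(u, v))


def gram_matrix(points, kernel):
    """Matrix of kernel evaluations K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]


# Arbitrary sample points, chosen only for illustration.
points = [(1.0, 2.0), (-0.5, 3.0), (2.0, 0.0)]

# Gram matrix computed directly from the kernel function.
K = gram_matrix(points, poly2_kernel)

# The same matrix computed as inner products of explicit feature vectors.
features = [poly2_feature_map(p) for p in points]
K_from_features = [[inner(f, g) for g in features] for f in features]

# Largest entrywise gap between the two computations.
max_gap = max(
    abs(K[i][j] - K_from_features[i][j])
    for i in range(len(points))
    for j in range(len(points))
)
print(max_gap)
```

For kernels such as the Gaussian kernel the feature space is infinite-dimensional and no finite feature vector of this kind is available, which is why working with the kernel evaluations directly is the usual practice.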