Measuring Statistical Dependence with Hilbert-Schmidt Norm
Revision as of 11:02, 24 June 2009
The Hilbert-Schmidt norm of the cross-covariance operator is proposed as an independence criterion in reproducing kernel Hilbert spaces (RKHSs). The measure is referred to as the Hilbert-Schmidt Independence Criterion, or HSIC. An empirical estimate of this measure is introduced, which can be used in many practical applications such as Independent Component Analysis (ICA), Maximum Variance Unfolding (MVU), feature extraction, and feature selection.
RKHS Theory
Let [math]\displaystyle{ \mathcal{F} }[/math] be a Hilbert space of functions from [math]\displaystyle{ \mathcal{X} }[/math] to [math]\displaystyle{ \mathbb{R} }[/math]. We assume [math]\displaystyle{ \mathcal{F} }[/math] is a reproducing kernel Hilbert space (RKHS), i.e., for all [math]\displaystyle{ x\in \mathcal{X} }[/math], the corresponding Dirac evaluation operator [math]\displaystyle{ \delta_x:\mathcal{F} \rightarrow \mathbb{R} }[/math], which maps [math]\displaystyle{ f }[/math] to [math]\displaystyle{ f(x) }[/math], is a bounded (or equivalently continuous) linear functional. The reproducing kernel of [math]\displaystyle{ \mathcal{F} }[/math] is [math]\displaystyle{ k(x,x')=\langle \phi(x), \phi(x') \rangle_{\mathcal{F}} }[/math], where [math]\displaystyle{ k:\mathcal{X}\times \mathcal{X}\rightarrow \mathbb{R} }[/math] is a positive definite function and [math]\displaystyle{ \phi:\mathcal{X}\rightarrow \mathcal{F} }[/math] is the feature map of [math]\displaystyle{ \mathcal{F} }[/math]. Similarly, we consider another RKHS [math]\displaystyle{ \mathcal{G} }[/math] with domain [math]\displaystyle{ \mathcal{Y} }[/math], kernel [math]\displaystyle{ l(\cdot,\cdot) }[/math], and feature map [math]\displaystyle{ \psi }[/math]. We assume both [math]\displaystyle{ \mathcal{F} }[/math] and [math]\displaystyle{ \mathcal{G} }[/math] are separable, i.e., each has a countable complete orthonormal basis.
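The relation between a kernel and its feature map can be illustrated with a small numerical sketch. The example below is not taken from the summarized paper: it assumes the homogeneous quadratic kernel k(x, x') = (x · x')^2 on R^2, whose explicit feature map lists all products x_i x_j. All function names and sample points are hypothetical and chosen only for illustration. The sketch checks that the kernel value equals the inner product of the feature vectors and that a quadratic form of the Gram matrix is non-negative, as positive definiteness requires.

```python
import itertools
import math


def poly2_kernel(x, y):
    """Homogeneous quadratic kernel k(x, y) = (x . y)^2 (illustrative choice)."""
    return sum(a * b for a, b in zip(x, y)) ** 2


def poly2_feature_map(x):
    """Explicit feature map phi(x): all products x_i * x_j over index pairs (i, j)."""
    return [a * b for a, b in itertools.product(x, repeat=2)]


def inner(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gram_matrix(points, kernel):
    """Gram matrix K with K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]


def quadratic_form(K, c):
    """Compute c^T K c for a square matrix K given as nested lists."""
    n = len(c)
    return sum(c[i] * K[i][j] * c[j] for i in range(n) for j in range(n))


# Hypothetical sample points in R^2.
points = [(1.0, 2.0), (0.5, -1.0), (-2.0, 0.3)]
K = gram_matrix(points, poly2_kernel)

# k(x, x') should agree with <phi(x), phi(x')> for every pair of points.
for p in points:
    for q in points:
        lhs = poly2_kernel(p, q)
        rhs = inner(poly2_feature_map(p), poly2_feature_map(q))
        assert math.isclose(lhs, rhs, rel_tol=1e-9, abs_tol=1e-12)

# Positive definiteness implies c^T K c >= 0 for any coefficient vector c.
coeffs = (1.0, -2.0, 0.5)
assert quadratic_form(K, coeffs) >= -1e-9
```

For kernels such as the Gaussian kernel the feature space is infinite-dimensional, so the feature map cannot be listed explicitly in this way; the kernel is then evaluated directly.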