kernelized Sorting: Difference between revisions
Line 33: | Line 33: | ||
let <math>\mathcal {F}</math> be the Reproducing Kernel Hilbert Space (RKHS) on | let <math>\mathcal {F}</math> be the Reproducing Kernel Hilbert Space (RKHS) on | ||
$\mathcal X$ with associated kernel <math>k: \mathcal X \times \mathcal X \rightarrow | $\mathcal {X}$ with associated kernel <math>k: \mathcal X \times \mathcal X \rightarrow | ||
\mathbb{R}</math> and feature map <math>\phi: \mathcal X \rightarrow \mathcal {F}</math>. | \mathbb{R}</math> and feature map <math>\phi: \mathcal X \rightarrow \mathcal {F}</math>. | ||
Let <math>\mathcal {G}</math> be the RKHS on <math>\mathcal Y</math> with kernel <math>l</math> and | Let <math>\mathcal {G}</math> be the RKHS on <math>\mathcal Y</math> with kernel <math>l</math> and | ||
Line 41: | Line 41: | ||
<math> | <math> | ||
C_{xy}=\mathbb{E}_{xy}[(\phi(x)-\mu_{x})\otimes (\psi(y)-\mu_{y})], | C_{xy}=\mathbb{E}_{xy}[(\phi(x)-\mu_{x})\otimes (\psi(y)-\mu_{y})], | ||
</math> | |||
where | |||
</math>. | where <math>\mu_{x}=\mathbb{E}[\phi(x)]$, $\mu_{y}=\mathbb{E}[\psi(y)]</math>. |
Revision as of 11:48, 12 July 2009
Object matching is a fundamental operation in data analysis. It typically requires the definition of a similarity measure between the classes of objects to be matched. Instead, we develop an approach which is able to perform matching by requiring a similarity measure only within each of the classes. This is achieved by maximizing the dependency between matched pairs of observations by means of the Hilbert Schmidt Independence Criterion. This problem can be cast as one of maximizing a quadratic assignment problem with special structure and we present a simple algorithm for finding a locally optimal solution.
Introduction
Problem Statement
Assume we are given two collections of documents purportedly covering the same content, written in two different languages. Can we determine the correspondence between these two sets of documents without using a dictionary?
Sorting and Matching
(Formal) problem formulation:
Given two sets of observations [math]\displaystyle{ X= \{ x_{1},..., x_{m} \} }[/math] and [math]\displaystyle{ Y=\{ y_{1},..., y_{m}\} }[/math]
Find a permutation matrix [math]\displaystyle{ \pi \in \Pi_{m} }[/math],
[math]\displaystyle{ \Pi_{m}:= \{ \pi | \pi \in \{0,1\}^{m \times m} where \pi 1_{m}=1_{m}, \pi^{T}1_{m}=1_{m}\} }[/math]
such that [math]\displaystyle{ \{ (x_{i},y_{\pi (i)}) for 1 \leqslant i \leqslant m \} }[/math] is maximally dependent. Here [math]\displaystyle{ 1_{m} \in \mathbb{R}^{m} }[/math] is the vector of all ones.
Denote by [math]\displaystyle{ D(Z(\pi)) }[/math] a measure of the dependence between x and y. Then we define nonparametric sorting of X and Y as follows
[math]\displaystyle{ \pi^{\ast}:=argmax_{\pi \in \prod_{m}}D(Z(\pi)). }[/math]
Hilbert Schmidt Independence Criterion
Let sets of observations X and Y be drawn jointly from some probability distribution [math]\displaystyle{ Pr_{xy} }[/math]. The Hilbert Schmidt Independence Criterion (HSIC) measures the dependence between x and y by computing the norm of the cross-covariance operator over the domain [math]\displaystyle{ \mathcal X \times \mathcal Y }[/math] in Hilbert Space.
let [math]\displaystyle{ \mathcal {F} }[/math] be the Reproducing Kernel Hilbert Space (RKHS) on $\mathcal {X}$ with associated kernel [math]\displaystyle{ k: \mathcal X \times \mathcal X \rightarrow \mathbb{R} }[/math] and feature map [math]\displaystyle{ \phi: \mathcal X \rightarrow \mathcal {F} }[/math]. Let [math]\displaystyle{ \mathcal {G} }[/math] be the RKHS on [math]\displaystyle{ \mathcal Y }[/math] with kernel [math]\displaystyle{ l }[/math] and feature map [math]\displaystyle{ \psi }[/math]. The cross-covariance operator [math]\displaystyle{ C_{xy}:\mathcal {G}\rightarrow \mathcal {F} }[/math] is defined by
[math]\displaystyle{ C_{xy}=\mathbb{E}_{xy}[(\phi(x)-\mu_{x})\otimes (\psi(y)-\mu_{y})], }[/math]
where [math]\displaystyle{ \mu_{x}=\mathbb{E}[\phi(x)]$, $\mu_{y}=\mathbb{E}[\psi(y)] }[/math].