# Hamming Distance Metric Learning

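The hash function $b$ maps an input $x$ to a binary code by taking the sign of a real-valued function $f$ with parameters $w$:

$b(x,w)=\mathrm{sign}(f(x,w))$
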
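As an illustration of the sign-based hash function, here is a minimal Python sketch. The notes do not specify the form of $f$, so the linear map in `linear_score` is an assumption made only for this example, as are the names `linear_score` and `binary_code` and the choice to map a score of exactly zero to +1.

```python
def linear_score(x, w):
    """Hypothetical f(x, w): one dot product per code bit (w is a list of weight rows)."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def binary_code(x, w):
    """b(x, w) = sign(f(x, w)); a zero score is mapped to +1 (an assumption)."""
    return [1 if score >= 0 else -1 for score in linear_score(x, w)]


x = [0.5, -1.0, 2.0]
w = [
    [1.0, 0.0, 0.0],    # score  0.5 -> bit +1
    [0.0, 1.0, 0.0],    # score -1.0 -> bit -1
    [-1.0, 0.0, -1.0],  # score -2.5 -> bit -1
]
code = binary_code(x, w)
print(code)  # [1, -1, -1]
```

Each row of `w` produces one bit, so the code length equals the number of rows.
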
In a previous paper, the authors used a loss function that resembles the hinge loss used in SVMs. It includes a hyper-parameter $P$, a threshold in Hamming space that separates neighbors from non-neighbors: similar points should map to binary codes that differ in no more than $P$ bits, and dissimilar points should map to codes that are no closer than $P$ bits. For two binary codes $h$ and $g$ with Hamming distance $\|h-g\|_H$ and a similarity label $s \in \{0,1\}$, the pairwise hinge loss function is defined as:

$l_{pair}(h,g,s)=[\|h-g\|_H-P+1]_+$ for $s=1$ (similar), and $l_{pair}(h,g,s)=[P-\|h-g\|_H+1]_+$ for $s=0$ (dissimilar),

where $[\alpha]_+=\max(\alpha,0)$.

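The pairwise hinge loss can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: codes are represented as equal-length lists of bits, the loss uses a unit margin around the threshold, and the names `hamming_distance`, `pairwise_hinge_loss`, and the parameter `p` (standing for $P$) are hypothetical.

```python
def hamming_distance(h, g):
    """Number of positions at which two equal-length codes differ."""
    return sum(a != b for a, b in zip(h, g))


def pairwise_hinge_loss(h, g, s, p):
    """Hinge-like loss with Hamming threshold p (unit margin assumed).

    s == 1: similar pair, penalized once the distance reaches p.
    s == 0: dissimilar pair, penalized while the distance is at most p.
    """
    d = hamming_distance(h, g)
    if s == 1:
        return max(d - p + 1, 0)
    return max(p - d + 1, 0)


h = [1, 1, -1, -1]
g = [1, -1, -1, 1]  # differs from h in 2 bits

print(pairwise_hinge_loss(h, g, s=1, p=3))  # 0: similar pair well inside the threshold
print(pairwise_hinge_loss(h, g, s=0, p=3))  # 2: dissimilar pair that is too close
```

The example shows why the choice of `p` matters: the same pair of codes incurs a different loss for every threshold value.
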
In practice, however, choosing a good value of $P$ is not easy. Moreover, in some datasets only the relative pairwise distances matter, not their precise numerical values. As a result, in this paper the authors define the loss function in terms of relative similarity. It is assumed that the dataset includes triplets of items $(x,x^+,x^-)$ such that $x$ is more similar to $x^+$ than to $x^-$. Under this assumption, the ranking loss on a triplet of binary codes $(h,h^+,h^-)$ is:

$l_{triple}(h,h^+,h^-)=[\|h-h^+\|_H-\|h-h^-\|_H+1]_+$
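
The triplet ranking loss can be computed directly from two Hamming distances. The following Python sketch assumes codes are equal-length lists of bits; the function names `hamming_distance` and `triplet_loss` are hypothetical and chosen only for this example.

```python
def hamming_distance(h, g):
    """Number of positions at which two equal-length codes differ."""
    return sum(a != b for a, b in zip(h, g))


def triplet_loss(h, h_pos, h_neg):
    """[ d(h, h_pos) - d(h, h_neg) + 1 ]_+ with [a]_+ = max(a, 0)."""
    return max(hamming_distance(h, h_pos) - hamming_distance(h, h_neg) + 1, 0)


h = [1, 1, 1, 1]
h_pos = [1, 1, 1, -1]   # 1 bit away from h
h_neg = [-1, -1, 1, 1]  # 2 bits away from h

print(triplet_loss(h, h_pos, h_neg))  # 0: the negative is at least one bit farther
print(triplet_loss(h, h_neg, h_pos))  # 2: roles swapped, so the ranking is violated
```

No threshold appears in this loss: it is zero whenever the dissimilar code is at least one bit farther from $h$ than the similar code, regardless of the absolute distances.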