positive Semidefinite Metric Learning Using Boosting-like Algorithms

From statwiki
Revision as of 18:19, 3 July 2013 by F33li (talk | contribs) (Created page with "A Mahlanobis distance metric learning method is proposed in the paper. The training samples are made into “triplets” (a<sub>i</sub>, a<sub>j</sub>, a<sub>k</sub>), where a<su...")
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

A Mahlanobis distance metric learning method is proposed in the paper. The training samples are made into “triplets” (ai, aj, ak), where ai and aj are in the same class, but ai and ak are in different classes. The goal is to find a positive semi-definite matrix M so that the difference of dist(i,j) and dist(i,k) for all the triplets can be as large as possible in the new transformation space. This can be formulated into a semi-definite programming (SDP) problem. The authors propose to replace M by a convex combination of one-rank p.s.d. matrices. Instead of solving these matrices, they assume that the matrices are known. However, the possibility of such matrices is infinite, and thus the number of constraints in the dual problem would be too many to solve. So they use column generation to find the most violated constraint in the dual problem iteratively and add the constraint into the optimization framework. The weight for the corresponding one-rank matrix is calculated. This is similar to the classification method “boosting”, which has the power of turning many weak classifiers to a strong classifier. Here the one-rank matrices can be considered as the weak classifiers. When no more constraint can be added, the final matrix to be learned is the weighted combination of the one-rank matrices added. Three types of objective functions are proposed in the paper: hinge loss, exponential loss and logistic loss. In each iteration, a new base (one-rank matrix) is computed using eigenvalue-decomposition. There are two strategies to update weights. One is called “stage-wise boosting”, in which the weight for the current base is computed based on the previous weights which are fixed. The bisection method is used to search the optimal weight. The other strategy is called “totally corrective boosting”, in which the weights are optimized in each iteration. The test results show that this method is faster than the LMNN algorithm as the input dimensions increase, while the test error is even slight lower than LMNN.