## Distance Metric Learning for Large Margin Nearest Neighbor Classification

**Kilian Q. Weinberger, Lawrence K. Saul**; 10(9):207−244, 2009.

### Abstract

The accuracy of *k*-nearest neighbor (kNN) classification depends
significantly on the metric used to compute distances between
different examples. In this paper, we show how to learn a Mahalanobis
distance metric for kNN classification from labeled examples. The
Mahalanobis metric can equivalently be viewed as a global linear
transformation of the input space that precedes kNN classification
using Euclidean distances. In our approach, the metric is trained
with the goal that the *k*-nearest neighbors always belong to the same
class while examples from different classes are separated by a large
margin. As in support vector machines (SVMs), the margin criterion
leads to a convex optimization based on the hinge loss. Unlike
learning in SVMs, however, our approach requires no modification or
extension for problems in multiway (as opposed to binary)
classification. In our framework, the Mahalanobis distance metric is
obtained as the solution to a semidefinite program. On several data
sets of varying size and difficulty, we find that metrics trained in
this way lead to significant improvements in kNN classification.
Sometimes these results can be further improved by clustering the
training examples and learning an individual metric within each
cluster. We show how to learn and combine these local metrics in a
globally integrated manner.
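The equivalence noted above, between a Mahalanobis metric M = LᵀL and a global linear transformation L followed by ordinary Euclidean distance, can be sketched in a few lines of NumPy (a minimal illustration; the function name is ours, not from the paper):

```python
import numpy as np

def mahalanobis_dist(x, y, L):
    """Distance under the Mahalanobis metric M = L^T L, computed
    equivalently as the Euclidean distance between Lx and Ly."""
    diff = L @ (x - y)
    return float(np.sqrt(diff @ diff))

x = np.array([1.0, 2.0])
y = np.array([4.0, 6.0])
L = np.array([[2.0, 0.0],
              [1.0, 1.0]])  # any real matrix L induces a valid (pseudo)metric

# The direct quadratic form (x - y)^T M (x - y), with M = L^T L positive
# semidefinite, gives the same squared distance as transforming the inputs
# first and then measuring Euclidean distance.
M = L.T @ L
d2_quadratic = (x - y) @ M @ (x - y)
d2_transform = mahalanobis_dist(x, y, L) ** 2
print(np.isclose(d2_quadratic, d2_transform))  # True
```

This is why the learned metric can be applied by simply mapping every input through L once and then running a standard Euclidean kNN classifier unchanged.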

© JMLR 2009.