## When Is There a Representer Theorem? Vector Versus Matrix Regularizers

**Andreas Argyriou, Charles A. Micchelli, Massimiliano Pontil**; 10(87):2507−2529, 2009.

### Abstract

We consider a general class of regularization methods which
learn a vector of parameters on the basis of linear measurements. It
is well known that if the regularizer is a nondecreasing function of
the $L_2$ norm, then the learned vector is a linear combination of
the input data. This result, known as the *representer theorem*, lies at
the basis of kernel-based methods in machine learning. In this paper,
we prove the necessity of the above condition, in the case of differentiable regularizers.
We further extend our analysis to regularization methods which learn a matrix, a
problem which is motivated by the application to multi-task
learning. In this context, we study a more general representer
theorem, which holds for a larger class of regularizers. We provide a
necessary and sufficient condition characterizing this class of matrix
regularizers and we highlight some concrete examples of
practical importance. Our analysis uses basic principles from matrix
theory, especially the useful notion of matrix nondecreasing functions.
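
The classical vector result recalled above can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes a squared-error loss and the regularizer $\lambda \|w\|_2^2$ (a nondecreasing function of the $L_2$ norm), and the function names `ridge_primal` and `ridge_dual_coefficients` are hypothetical. It solves the problem directly over $w$ and then confirms that the minimizer equals $X^\top c$ for some coefficient vector $c$, i.e. that it is a linear combination of the input vectors.

```python
import numpy as np

def ridge_primal(X, y, lam):
    """Minimize ||X w - y||^2 + lam * ||w||^2 directly over w (assumed setup)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def ridge_dual_coefficients(X, y, lam):
    """Return c such that the minimizer can be written as w = X^T c."""
    m = X.shape[0]
    return np.linalg.solve(X @ X.T + lam * np.eye(m), y)

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 20))  # 5 input vectors in R^20, one per row
y = rng.standard_normal(5)        # one linear measurement per input
lam = 0.1

w = ridge_primal(X, y, lam)
c = ridge_dual_coefficients(X, y, lam)

# The learned vector coincides with a linear combination of the inputs.
in_span = np.allclose(w, X.T @ c)
print("w lies in the span of the inputs:", in_span)
```

With only 5 inputs in a 20-dimensional space, the span of the data is a strict subspace, so the agreement between `w` and `X.T @ c` is not automatic; it follows from the form of the regularizer.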

© JMLR 2009.