## Optimal Solutions for Sparse Principal Component Analysis

** Alexandre d'Aspremont, Francis Bach, Laurent El Ghaoui**; 9(42):1269−1294, 2008.

### Abstract

Given a sample covariance matrix, we examine the problem of maximizing
the variance explained by a linear combination of the input variables
while constraining the number of nonzero coefficients in this
combination. This is known as sparse principal component analysis and
has a wide array of applications in machine learning and
engineering. We formulate a new semidefinite relaxation to this
problem and derive a greedy algorithm that computes a *full set*
of good solutions for all target numbers of non zero coefficients,
with total complexity *O*(*n*^{3}), where *n*
is the number of variables. We then use the same relaxation to derive
sufficient conditions for global optimality of a solution, which can
be tested in *O*(*n*^{3}), per pattern. We discuss
applications in subset selection and sparse recovery and show on
artificial examples and biological data that our algorithm does
provide globally optimal solutions in many cases.

© JMLR 2008. (edit, beta) |