## Optimal Solutions for Sparse Principal Component Analysis

** Alexandre d'Aspremont, Francis Bach, Laurent El Ghaoui**; 9(Jul):1269--1294, 2008.

### Abstract

Given a sample covariance matrix, we examine the problem of maximizing the variance explained by a linear combination of the input variables while constraining the number of nonzero coefficients in this combination. This is known as sparse principal component analysis and has a wide array of applications in machine learning and engineering. We formulate a new semidefinite relaxation to this problem and derive a greedy algorithm that computes a*full set*of good solutions for all target numbers of non zero coefficients, with total complexity

*O*(

*n*

^{3}), where

*n*is the number of variables. We then use the same relaxation to derive sufficient conditions for global optimality of a solution, which can be tested in

*O*(

*n*

^{3}), per pattern. We discuss applications in subset selection and sparse recovery and show on artificial examples and biological data that our algorithm does provide globally optimal solutions in many cases.

[abs][pdf]