next up previous
Next: Termination Criterion Up: The Decomposition Algorithm Previous: Solving the Subproblem

Shrinking

The idea of shrinking is to remove some variables whose values have been equal to the bounds 0 or C for a long time, and that will probably not change anymore. To do this, we use the fact that $(\boldsymbol{\alpha},\, \boldsymbol{\alpha^{\star}})$ minimizes the problem (3) under the constraints (4) and (5) if and only if there exists numbers $\boldsymbol{\lambda}^{up} \in \mathbb{R} ^{2l}$, $\boldsymbol{\lambda}^{low} \in \mathbb{R} ^{2l}$, $\lambda^{eq} \in \mathbb{R} $ that verify the following KKT conditions:

 \begin{displaymath}\mathcal{W}^{'}(\boldsymbol{\alpha},\, \boldsymbol{\alpha}^{\...
...ol{\lambda}^{low} + \boldsymbol{\lambda}^{up} = \boldsymbol{0}
\end{displaymath} (24)


\begin{displaymath}\lambda^{low}_i \, \alpha_i = 0, \ \ i = 1\ldots l \ \ \ \tex...
...da^{low}_i \, \alpha^{\star}_{i-l} = 0, \ \ i = (l+1)\ldots 2l
\end{displaymath} (25)


\begin{displaymath}\lambda^{up}_i \left( \alpha_i - C \right) = 0, \ \ i = 1\ldo...
...( \alpha^{\star}_{i-l} - C \right) = 0, \ \ i = (l+1)\ldots 2l
\end{displaymath} (26)


\begin{displaymath}\boldsymbol{\lambda}^{low} \geq \boldsymbol{0}
\end{displaymath} (27)


 \begin{displaymath}\boldsymbol{\lambda}^{up} \geq \boldsymbol{0}
\end{displaymath} (28)


 \begin{displaymath}(\boldsymbol{\alpha}-\boldsymbol{\alpha}^{\star})^{\raisebox{2pt}{\tiny T}} \, \boldsymbol{1} = 0
\end{displaymath} (29)


 \begin{displaymath}\boldsymbol{0} \leq \boldsymbol{\alpha}^{\star}, \ \boldsymbol{\alpha} \leq \boldsymbol{C}.
\end{displaymath} (30)

Note that if $\lambda^{low}_i > 0$, then the corresponding variable is equal to 0. Also, if $\lambda^{up}_i > 0$, then the corresponding variable is equal to C. The idea is thus to search at each iteration for variables $\boldsymbol{\lambda}^{up}$, $\boldsymbol{\lambda}^{low}$ and $\lambda^{eq}$that verify as well as possible3 the equations (24)-(28), and to remove a variable whose value is equal to 0 if its coefficient $\lambda^{low}_i$ is strictly positive (or just above a constant $\epsilon_{shrink}$) during a given number of iterations. Using the same idea, we also eliminate a variable whose value is equal to Cif its coefficient $\lambda^{up}_i$ stays strictly positive for a sufficient number of iterations.

To estimate $\boldsymbol{\lambda}^{low}$ or $\boldsymbol{\lambda}^{up}$, we start by estimating $\lambda^{eq}$ (note that if $0 < \alpha_i < C$ then $\lambda^{low}_i = \lambda^{up}_i = 0$ and if $0 < \alpha^{\star}_i < C$ then $\lambda^{low}_{i+l} = \lambda^{up}_{i+l} = 0$). Noting

\begin{displaymath}A = \{ i, \ 0 < \alpha_i < C \}, \ \ B = \{ i, \ 0 < \alpha^{\star}_i < C \}\end{displaymath}

we have (with $\hat{\lambda}$ standing for an estimation of $\lambda$) :

 \begin{displaymath}\hat{\lambda}^{eq} = \frac{1}{\vert A\cup B\vert} \( \sum_{i ...
...i^{'}(\boldsymbol{\alpha},\, \boldsymbol{\alpha}^{\star})
\) .
\end{displaymath} (31)

Then if we have

 \begin{displaymath}\begin{array}{rclcrcl}
\alpha_i = 0 & \textrm{we compute} & \...
... = & \hat{\lambda}^{eq} & - & \mathcal{W}_{i+l}^{'}
\end{array}\end{displaymath} (32)

and if a variable stays a sufficient number of iterations at 0 (or C) with its corresponding coefficient $\hat{\lambda}^{low}_j > \epsilon_{shrink}$(or $\hat{\lambda}^{up}_j > \epsilon_{shrink}$), then we remove it from the problem. Note that this can lead to an incorrect solution if $\epsilon_{shrink}$ or the number of iterations before removing a variable is too small. This is verified in the experimental section.


next up previous
Next: Termination Criterion Up: The Decomposition Algorithm Previous: Solving the Subproblem
Journal of Machine Learning Research