We propose to select a new set of *q*
variables such that the overall criterion will
be optimized. In order to select such a working set, we use the same
idea as Joachims [5]: simply search for
the optimal gradient-descent direction
**p** which is *feasible*
and which has only *q* non-null components. The variables corresponding to
these components are chosen to be the new working set
.

We thus need to minimize:

with

subject to:

and

Since we are searching for an optimal descent direction, which is a
direction where the scalar product with the gradient is the smallest,
we want indeed to minimize (6). The conditions
(7) and (8) are necessary
to ensure the feasibility of the obtained direction. The condition
(9) is there only to ensure that the problem has a
solution. Finally, (10) is imposed because we are
searching for a direction with only *q* non-null components.

Note that the derivative of
can be easily computed:

In order to solve this problem, it thus suffices to consider

where for and for . To simplify, let

and let us select also the

Since we are searching for exactly *q* variables, the
must
be distinct. We could have to reduce *q* if one variable is selected
twice.

is a solution of the minimization problem (6).

**Proof:** Let us go back to the minimization
problem of the function

subject to

and

with . (The reasoning is the same if we take the constraints (8) into account).

In the case where
2*l* = *q* = 2*r*, it is easy to see that the minimum is obtained
for *z*_{i} = -1 if
and *z*_{i} = 1 if
:
if for
instance *z*_{i0} is augmented by
for a
,
then one needs to compensate by
another *z*_{j} to maintain
(12). Since we want to minimize (11),
the best thing to do, knowing that the
are sorted in reverse
order and keeping in mind the constraint (13), is to modify
*z*_{q} and thus to fix
.
Equation (11) is
then augmented by
,
which is a positive
value because the
are sorted and thus we get out of the minimum.

In the case where
2*l* > *q* = 2*r*, suppose we found a
**z**
which is a solution
of (11). Let us denote
the *q* indices of
the components of
**z** which are non-null. Using the same
argument as in the previous paragraph, it is clear that
and that
.
In other words, if
**z** is a solution of our problem, then
we necessarily have
.
Considering again the order
of the ,
it becomes evident that we have to take
and
.

Our new working set
is then composed of the *q* variables
corresponding to the indices *c*_{i} (where an index
corresponds to
and an index *c*_{i} > *l* corresponds to
).