Next: Solving the Subproblem Up: The Decomposition Algorithm Previous: The Decomposition Algorithm

## Selection of a New Working Set

We propose to select a new set of q variables such that the overall criterion will be optimized. In order to select such a working set, we use the same idea as Joachims [5]: simply search for the optimal gradient-descent direction p which is feasible and which has only q non-null components. The variables corresponding to these components are chosen to be the new working set .

We thus need to minimize:

 (6)

with

subject to:

 (7)

 (8)

and

 (9)

 (10)

Since we are searching for an optimal descent direction, which is a direction where the scalar product with the gradient is the smallest, we want indeed to minimize (6). The conditions (7) and (8) are necessary to ensure the feasibility of the obtained direction. The condition (9) is there only to ensure that the problem has a solution. Finally, (10) is imposed because we are searching for a direction with only q non-null components.

Note that the derivative of can be easily computed:

In order to solve this problem, it thus suffices to consider

where for and for . To simplify, let q be even, and let us sort the in decreasing order. Let us then denote as the bijection of into itself such that the are sorted. Let us then select the q/2 first indices such that:

and let us select also the q/2 last indices such that:

Since we are searching for exactly q variables, the must be distinct. We could have to reduce q if one variable is selected twice.

Lemma 1   Let us now denote , the q indices we just chose, then the direction such that

is a solution of the minimization problem (6).

Proof: Let us go back to the minimization problem of the function

 (11)

subject to

 (12)

 (13)

and

 (14)

with . (The reasoning is the same if we take the constraints (8) into account).

In the case where 2l = q = 2r, it is easy to see that the minimum is obtained for zi = -1 if and zi = 1 if : if for instance zi0 is augmented by for a , then one needs to compensate by another zj to maintain (12). Since we want to minimize (11), the best thing to do, knowing that the are sorted in reverse order and keeping in mind the constraint (13), is to modify zq and thus to fix . Equation (11) is then augmented by , which is a positive value because the are sorted and thus we get out of the minimum.

In the case where 2l > q = 2r, suppose we found a z which is a solution of (11). Let us denote the q indices of the components of z which are non-null. Using the same argument as in the previous paragraph, it is clear that and that . In other words, if z is a solution of our problem, then we necessarily have . Considering again the order of the , it becomes evident that we have to take and .

Our new working set is then composed of the q variables corresponding to the indices ci (where an index corresponds to and an index ci > l corresponds to ).

Next: Solving the Subproblem Up: The Decomposition Algorithm Previous: The Decomposition Algorithm
Journal of Machine Learning Research