## Exponentiated Gradient Algorithms for Conditional Random Fields and Max-Margin Markov Networks

** Michael Collins, Amir Globerson, Terry Koo, Xavier Carreras, Peter L. Bartlett**; 9(58):1775−1822, 2008.

### Abstract

Log-linear and maximum-margin models are two commonly-used methods in
supervised machine learning, and are frequently used in structured
prediction problems. Efficient learning of parameters in these models
is therefore an important problem, and becomes a key factor when
learning from very large data sets. This paper describes
exponentiated gradient (EG) algorithms for training such models, where
EG updates are applied to the convex dual of either the log-linear or
max-margin objective function; the dual in both the log-linear and
max-margin cases corresponds to minimizing a convex function with
simplex constraints. We study both batch and online variants of the
algorithm, and provide rates of convergence for both cases. In the
max-margin case, *O*(1/ε) EG updates are required to
reach a given accuracy ε in the dual; in contrast, for
log-linear models only *O*(log(1/ε)) updates are
required. For both the max-margin and log-linear cases, our bounds
suggest that the online EG algorithm requires a factor of *n* less
computation to reach a desired accuracy than the batch EG algorithm,
where *n* is the number of training examples. Our experiments confirm
that the online algorithms are much faster than the batch algorithms
in practice. We describe how the EG updates factor in a convenient
way for structured prediction problems, allowing the algorithms to be
efficiently applied to problems such as sequence learning or natural
language parsing. We perform extensive evaluation of the algorithms,
comparing them to L-BFGS and stochastic gradient descent for
log-linear models, and to SVM-Struct for max-margin
models. The algorithms are applied to a multi-class
problem as well as to a more complex large-scale parsing task. In all
these settings, the EG algorithms presented here outperform the other
methods.

© JMLR 2008. (edit, beta) |