## The Rate of Convergence of AdaBoost

*Indraneel Mukherjee, Cynthia Rudin, Robert E. Schapire*; 14(Aug):2315−2347, 2013.

### Abstract

The AdaBoost algorithm was designed to combine many “weak”
hypotheses that perform slightly better than random guessing
into a “strong” hypothesis that has very low error. We study the
rate at which AdaBoost iteratively converges to the minimum of
the “exponential loss”. Unlike previous work, our proofs do not
require a weak-learning assumption, nor do they require that
minimizers of the exponential loss are finite. Our first result
shows that the exponential loss of AdaBoost's computed parameter
vector will be at most $\varepsilon$ more than that of any
parameter vector of $\ell_1$-norm bounded by $B$ in a number of
rounds that is at most a polynomial in $B$ and $1/\varepsilon$.
We also provide lower bounds showing that a polynomial
dependence on $B$ and $1/\varepsilon$ is necessary. Our second result is that within
$C/\varepsilon$ iterations, AdaBoost achieves a value of the
exponential loss that is at most $\varepsilon$ more than the
best possible value, where $C$ depends on the data set. We show
that this dependence of the rate on $\varepsilon$ is optimal up to
constant factors; that is, $\Omega(1/\varepsilon)$ rounds are
necessary to come within $\varepsilon$ of the optimal exponential
loss.
