## Asymptotics in Empirical Risk Minimization

**Leila Mohammadi, Sara van de Geer**; 6(68):2027–2047, 2005.

### Abstract

In this paper, we study a two-category classification problem. We indicate the categories by labels $Y=1$ and $Y=-1$. We observe a covariate, or feature, $X \in \mathcal{X} \subset \mathbb{R}^d$. Consider a collection $\{h_a\}$ of classifiers indexed by a finite-dimensional parameter $a$, and the classifier $h_{a^*}$ that minimizes the prediction error over this class. The parameter $a^*$ is estimated by the empirical risk minimizer $\hat{a}_n$ over the class, where the empirical risk is calculated on a training sample of size $n$. We apply the Kim Pollard Theorem to show that under certain differentiability assumptions, $\hat{a}_n$ converges to $a^*$ with rate $n^{-1/3}$, and also present the asymptotic distribution of the renormalized estimator.

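
The quantities in this paragraph can be written out explicitly. The symbols $L$ and $L_n$ for the prediction error and the empirical risk, and the sample notation $(X_i, Y_i)$, are assumed here for illustration and may differ from the paper's own notation.

$$
\begin{aligned}
L(a) &= P\bigl(h_a(X) \neq Y\bigr), & a^* &= \arg\min_a L(a), \\
L_n(a) &= \frac{1}{n}\sum_{i=1}^{n} \mathbf{1}\{h_a(X_i) \neq Y_i\}, & \hat{a}_n &= \arg\min_a L_n(a).
\end{aligned}
$$

In this notation, the stated rate reads $\hat{a}_n - a^* = O_P(n^{-1/3})$, and the renormalized estimator is $n^{1/3}(\hat{a}_n - a^*)$.
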
For example, let $V_0$ denote the set of $x$ on which, given $X=x$, the label $Y=1$ is more likely (than the label $Y=-1$). If $X$ is one-dimensional, the set $V_0$ is the union of disjoint intervals. The problem is then to estimate the thresholds of the intervals. We obtain the asymptotic distribution of the empirical risk minimizer when the classifiers have $K$ thresholds, where $K$ is fixed. We furthermore consider an extension to higher-dimensional $X$, assuming basically that $V_0$ has a smooth boundary in some given parametric class.

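
The one-dimensional threshold problem can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code. It assumes the simplest case of a single threshold ($K=1$), the classifier $h_a(x) = 1$ if $x \ge a$ and $-1$ otherwise, and a simulated model with $X$ uniform on $[0, 1]$ and $P(Y=1 \mid X=x) = x$, so that the best threshold is $0.5$. The function names `empirical_risk`, `erm_threshold`, and `simulate` are hypothetical.

```python
import numpy as np


def empirical_risk(a, x, y):
    """Fraction of sample points misclassified by h_a(x) = 1 if x >= a else -1."""
    pred = np.where(x >= a, 1, -1)
    return np.mean(pred != y)


def erm_threshold(x, y):
    """Return a threshold that minimizes the empirical risk.

    The empirical risk is piecewise constant in a and changes only at the
    sample points, so it is enough to check each sample value plus one
    candidate above the maximum (which labels every point -1).
    """
    candidates = np.append(np.sort(x), np.max(x) + 1.0)
    risks = np.array([empirical_risk(a, x, y) for a in candidates])
    return candidates[np.argmin(risks)]


def simulate(n, rng):
    """Assumed toy model: X ~ Uniform(0, 1) and P(Y = 1 | X = x) = x."""
    x = rng.uniform(0.0, 1.0, size=n)
    y = np.where(rng.uniform(size=n) < x, 1, -1)
    return x, y


rng = np.random.default_rng(0)
x, y = simulate(2000, rng)
a_hat = erm_threshold(x, y)
print("estimated threshold:", a_hat)
```

Because the empirical risk is a step function of the threshold, its minimizer is not unique; the sketch returns the first minimizing candidate. Repeating the simulation for increasing `n` and inspecting the spread of `a_hat - 0.5` gives an informal way to look at the rate of convergence in this toy model.
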
We also discuss various rates of convergence when the differentiability conditions are possibly violated. Here, we again restrict ourselves to one-dimensional $X$. We show that the rate is $n^{-1}$ in certain cases, and then also obtain the asymptotic distribution for the empirical prediction error.

© JMLR 2005.