Tracking a Small Set of Experts by Mixing Past Posteriors
Olivier Bousquet, Manfred K. Warmuth;
In this paper, we examine on-line learning problems in which the target
concept is allowed to change over time. In each trial a master algorithm
receives predictions from a large set of n
experts. Its goal is to predict
almost as well as the best sequence of such experts chosen off-line by
partitioning the training sequence into k
+1 sections and then choosing
the best expert for each section. We build on methods developed by
Herbster and Warmuth and consider an open problem posed by
Freund where the experts in the best partition are from a small
pool of size m
, the best expert shifts back and forth
between the experts of the small pool.
We propose algorithms that solve
this open problem by mixing the past posteriors maintained by the master
algorithm. We relate the number of bits needed for encoding the best
partition to the loss bounds of the algorithms.
Instead of paying log n
choosing the best expert in each section we first pay log (n
bits in the bounds for identifying the pool of m
and then log m
bits per new section.
In the bounds we also pay twice for encoding the
boundaries of the sections.