As a first step, we performed a direct comparison of the performance of C5.0 and C5.0-BOOST (C5.0 called with the parameter -b, i.e., 10 iterations of boosting) to C5.0-RR, a single round robin procedure with C5.0 as the base learning algorithm. Table 4 shows the results of a 10-fold cross-validation on 17 datasets.
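To make the round robin procedure concrete, the following minimal sketch illustrates pairwise class binarization with voting. The base learner here is a trivial nearest-mean stub on one-dimensional data, standing in for C5.0 purely for illustration; all function names and data are invented, not taken from the experiments above.

```python
from itertools import combinations

def fit_base(xs, ys):
    """Toy binary base learner: predict the class whose mean is closer.
    (A stand-in for the real base learner, e.g. C5.0.)"""
    means = {c: sum(x for x, y in zip(xs, ys) if y == c) /
                sum(1 for y in ys if y == c)
             for c in set(ys)}
    return lambda x: min(means, key=lambda c: abs(x - means[c]))

def round_robin_fit(xs, ys):
    """Train one binary classifier per unordered class pair, each on the
    examples of those two classes only."""
    classes = sorted(set(ys))
    models = {}
    for a, b in combinations(classes, 2):
        sub = [(x, y) for x, y in zip(xs, ys) if y in (a, b)]
        models[(a, b)] = fit_base([x for x, _ in sub], [y for _, y in sub])
    return models

def round_robin_predict(models, x):
    """Each pairwise classifier casts one vote; the class with the most
    votes wins."""
    votes = {}
    for model in models.values():
        c = model(x)
        votes[c] = votes.get(c, 0) + 1
    return max(votes, key=votes.get)
```

For c classes this trains c(c-1)/2 binary problems, each smaller than the original multi-class problem, which is the essence of the round robin binarization evaluated in Table 4.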
The first thing to note is that the performance of C5.0 does indeed improve by about 10% on average if round robin binarization is used as a pre-processing step for multi-class problems. This is despite the fact that C5.0 can handle multi-class problems directly and does not depend on a class binarization routine. However, the gain is neither as consistent nor as large as the gain for RIPPER (Table 2), possibly because RIPPER's average error on the multi-class problems in our study is generally above that of C5.0 (by a factor of 1.122) and therefore leaves more room for improvement. A possible explanation for this is that the unordered and ordered binarization schemes used by RIPPER are not very good. This is confirmed by the fact that in a direct comparison (which can easily be computed from Tables 2 and 4), R3 decreases the average error of C5.0 by a factor of 0.838, and the error of C5.0-RR by a factor of 0.923. From this, we can conclude that round robin binarization helps RIPPER to outperform C5.0 on multi-class problems.
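One plausible way to read the error "factors" above is as the mean of per-dataset error ratios between two learners. The short sketch below illustrates that computation; the function name and the per-dataset error rates are invented for illustration, while the actual factors in the text come from Tables 2 and 4.

```python
def error_reduction_factor(errors_a, errors_b):
    """Mean of per-dataset ratios error(A)/error(B).
    A value below 1 means learner A has lower error on average."""
    ratios = [a / b for a, b in zip(errors_a, errors_b)]
    return sum(ratios) / len(ratios)

# Invented per-dataset error rates (%) for two hypothetical learners.
errs_rr   = [4.2, 10.1, 7.5]
errs_base = [5.0, 11.8, 9.6]
factor = error_reduction_factor(errs_rr, errs_base)  # < 1: RR is better here
```

Other aggregations (e.g. the ratio of the average errors, or a geometric mean of ratios) are also common; which one the reported factors use must be read off the cited tables.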
The direct comparison between round robin classification and boosting shows that the improvement of C5.0-RR over C5.0 is not as large as the improvement provided by boosting: although there are a few cases where round robin outperforms boosting, C5.0-BOOST is much more reliable than C5.0-RR, producing an average error reduction of more than 26% on these 17 datasets. The correlation between the error reduction rates of C5.0-BOOST and C5.0-RR is very weak (r = 0.276), which refutes our earlier hypothesis and raises the question of whether there is a fruitful combination of boosting and round robin classification. The last column of Table 4 answers this question negatively: using round robin classification with C5.0-BOOST as a base learner does not, on average, lead to performance improvements over boosting.
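A weak correlation such as the one reported above can be computed as a standard Pearson coefficient over the per-dataset error reduction rates of the two methods. The sketch below shows the computation on invented rates; it is not the paper's data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented per-dataset error reduction rates for two hypothetical methods.
boost_reduction = [0.30, 0.10, 0.45, 0.20]
rr_reduction    = [0.05, 0.12, 0.02, 0.15]
r = pearson(boost_reduction, rr_reduction)
```

A value of r near 0 indicates that the two methods improve on largely different datasets, which is what motivates asking whether they can be combined.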
These results are analogous to those of Schapire (1997), who compared ADABOOST.OC (error-correcting output codes as a binarization scheme for conventional two-class ADABOOST) with ADABOOST.M1 (Freund and Schapire, 1997), ADABOOST's straightforward adaptation to multi-class base learners (a version of which is presumably also implemented in C5.0; Quinlan, 1996), and found no significant differences for the base learner C4.5 (Quinlan, 1993), C5.0's predecessor. Similar to our comparison between C5.0-BOOST and round robin binarization, Schapire (1997) also found that boosting outperformed binarization via error-correcting output codes. In subsequent work, Allwein et al. (2000) showed that the performance gain of pairwise classification with ADABOOST as a base learner is on average indiscernible from that of alternative binarization schemes, including some that employ error-correcting output codes (such as ADABOOST.OC).