Optimising Kernel Parameters and Regularisation Coefficients for Non-linear Discriminant Analysis
Tonatiuh Peña Centeno, Neil D. Lawrence; 7(Feb):455--491, 2006.
AbstractIn this paper we consider a novel Bayesian interpretation of Fisher's discriminant analysis. We relate Rayleigh's coefficient to a noise model that minimises a cost based on the most probable class centres and that abandons the 'regression to the labels' assumption used by other algorithms. Optimisation of the noise model yields a direction of discrimination equivalent to Fisher's discriminant, and with the incorporation of a prior we can apply Bayes' rule to infer the posterior distribution of the direction of discrimination. Nonetheless, we argue that an additional constraining distribution has to be included if sensible results are to be obtained. Going further, with the use of a Gaussian process prior we show the equivalence of our model to a regularised kernel Fisher's discriminant. A key advantage of our approach is the facility to determine kernel parameters and the regularisation coefficient through the optimisation of the marginal log-likelihood of the data. An added bonus of the new formulation is that it enables us to link the regularisation coefficient with the generalisation error.