Learning mixtures of trees with shared structure
It is possible to modify the MIXTREE algorithm so as to
constrain the trees to share the same structure, and thereby
estimate MTSS models.
The E step remains unchanged. The only novelty is the reestimation of
the tree distributions in the M step, since they are now
constrained to have the same structure. Thus, the maximization cannot
be decoupled into separate tree estimations but, remarkably
enough, it can still be performed efficiently.
It can be readily verified that for any given structure the optimal
parameters of each tree edge $(u,v)$ are equal to the parameters of
the corresponding marginal distribution $P^k_{uv}$. It remains only
to find the optimal structure. The expression to be optimized is the
second sum on the right-hand side of equation (7). By
replacing $T^k_{uv}$ with $P^k_{uv}$ and denoting the mutual
information between $x_u$ and $x_v$ under $P^k$ by $I^k_{uv}$, this sum can
be re-expressed as follows:

$$\sum_{k=1}^m \Gamma_k \left[ \sum_{(u,v)\in E} I^k_{uv} \;-\; \sum_{v\in V} H^k(x_v) \right] \;=\; N \left[ \sum_{(u,v)\in E} I_{uv|z} \;-\; \sum_{v\in V} H(x_v \mid z) \right] \qquad (8)$$
The new quantity $I_{uv|z}$ appearing above represents the mutual
information of $x_u$ and $x_v$ conditioned on the hidden variable $z$. Its
general definition for three discrete variables $u$, $v$, $z$ distributed
according to $P(u,v,z)$
is

$$I_{uv|z} \;=\; \sum_{z} P(z) \sum_{u,v} P(u,v \mid z) \,\log \frac{P(u,v \mid z)}{P(u \mid z)\,P(v \mid z)}.$$
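As a concrete illustration of this definition (not part of the paper), the conditional mutual information can be computed directly from a joint probability table; the function name and array layout are hypothetical:

```python
import numpy as np

def conditional_mutual_information(P):
    """Compute I(u; v | z) in nats from a joint probability table.

    P is a 3-D array with P[i, j, k] = P(u=i, v=j, z=k),
    entries summing to 1.
    """
    Pz = P.sum(axis=(0, 1))   # P(z)
    Puz = P.sum(axis=1)       # P(u, z)
    Pvz = P.sum(axis=0)       # P(v, z)
    I = 0.0
    for i, j, k in np.ndindex(P.shape):
        if P[i, j, k] > 0:
            # P(u,v|z) / (P(u|z) P(v|z)) = P(u,v,z) P(z) / (P(u,z) P(v,z))
            I += P[i, j, k] * np.log(
                P[i, j, k] * Pz[k] / (Puz[i, k] * Pvz[j, k])
            )
    return I

# When u and v are independent given z, I(u; v | z) = 0.
P_indep = np.ones((2, 2, 2)) / 8.0
print(conditional_mutual_information(P_indep))  # → 0.0
```

When $u$ and $v$ are deterministically equal given $z$, the function returns $\log 2$ for binary variables, as expected.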
The second term in (8),
$-N\sum_{v\in V} H(x_v \mid z)$,
represents the sum of the conditional entropies of the variables given
$z$ and is independent of the tree structure. Hence, the optimization
of the structure can be achieved by running an MWST algorithm with the
edge weights given by $I_{uv|z}$. We summarize the algorithm in
Figure 7.
Figure 7:
The MIXTREESS algorithm for learning MTSS models.
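As a hypothetical sketch of the shared-structure step (not the paper's pseudocode): given the mixture coefficients $\lambda_k = \Gamma_k/N$ and the per-component pairwise marginals $P^k_{uv}$ from the E step, form the edge weights $I_{uv|z} = \sum_k \lambda_k I^k_{uv}$ and run a maximum-weight spanning tree. Kruskal's algorithm stands in for the MWST routine; the function names and data layout are assumptions.

```python
import numpy as np
from itertools import combinations

def mwst_shared_structure(lambdas, pair_marginals, n_vars):
    """Choose the shared tree structure by maximum-weight spanning tree.

    lambdas[k]                -- mixture coefficient P(z = k)
    pair_marginals[k][(u, v)] -- 2-D array holding the pairwise marginal
                                 P^k_{uv} of component k (assumed layout)
    Returns the edge list of the selected spanning tree.
    """
    def mutual_information(J):
        # I(u; v) from a single pairwise marginal table J.
        pu, pv = J.sum(axis=1), J.sum(axis=0)
        mask = J > 0
        return float((J[mask] * np.log(J / np.outer(pu, pv))[mask]).sum())

    # Edge weights: I_{uv|z} = sum_k lambda_k I^k_{uv}.
    weights = {
        (u, v): sum(lam * mutual_information(pair_marginals[k][(u, v)])
                    for k, lam in enumerate(lambdas))
        for u, v in combinations(range(n_vars), 2)
    }

    # Kruskal's algorithm on descending weights (maximum-weight tree).
    parent = list(range(n_vars))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    edges = []
    for (u, v), w in sorted(weights.items(), key=lambda e: -e[1]):
        ru, rv = find(u), find(v)
        if ru != rv:            # adding (u, v) creates no cycle
            parent[ru] = rv
            edges.append((u, v))
    return edges
```

Once the structure is fixed, each component's edge parameters are simply copied from its own marginals $P^k_{uv}$, as noted above.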

Journal of Machine Learning Research