
## The ACCL algorithm

We first present the ACCL algorithm for the case of binary
variables, presenting the extension to general discrete variables
in Section 6.2. For binary variables
we will say that a variable is ``on'' when it takes value 1; otherwise
we say that it is ``off''. Without loss of generality we assume
that each variable is off more often than it is on in the given dataset.
The target distribution $P$ is assumed to be derived from a set of
observations of size $N$. Let us denote by $N_v$ the number of times
variable $v$ is on in the dataset and by $N_{uv}$ the number of times
variables $u$ and $v$ are simultaneously on. We call each of the latter
events a *co-occurrence* of $u$ and $v$. The marginal of $u$
and $v$ is then given by:

$$
P_{uv}(1,1) = \frac{N_{uv}}{N}, \qquad
P_{uv}(1,0) = \frac{N_u - N_{uv}}{N}, \qquad
P_{uv}(0,1) = \frac{N_v - N_{uv}}{N}, \qquad
P_{uv}(0,0) = 1 - \frac{N_u + N_v - N_{uv}}{N}.
$$
All the information about $P$ that is necessary for fitting the tree
is summarized in the counts $N$, $N_v$ for each variable $v$, and
$N_{uv}$ for each pair of variables $u, v$,
and from now on we will consider $P$ to be represented by these counts.
(It is an easy extension to handle non-integer data, such as when
the data points are ``weighted'' by real numbers.)
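As a concrete illustration (a minimal sketch, not the paper's implementation), the counts described above and the resulting pairwise marginal can be accumulated directly from a binary dataset; the function name and data layout here are our own choices:

```python
def counts_and_marginal(data, u, v):
    """Given a binary dataset (a list of 0/1 tuples), compute the dataset
    size N, the per-variable counts N_v, the co-occurrence count N_uv for
    the pair (u, v), and the pairwise marginal P_uv over {0,1} x {0,1}."""
    N = len(data)
    n = len(data[0])
    # N_v: number of observations in which each variable is on
    Nv = [sum(x[j] for x in data) for j in range(n)]
    # N_uv: number of co-occurrences of u and v
    Nuv = sum(1 for x in data if x[u] == 1 and x[v] == 1)
    # The four cells of the marginal follow from the counts alone
    P = {
        (1, 1): Nuv / N,
        (1, 0): (Nv[u] - Nuv) / N,
        (0, 1): (Nv[v] - Nuv) / N,
        (0, 0): (N - Nv[u] - Nv[v] + Nuv) / N,
    }
    return N, Nv, Nuv, P
```

Note that once the counts are stored, the dataset itself is no longer needed: any pairwise marginal can be reconstructed from $N$, the per-variable counts, and the co-occurrence counts.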
We now define the notion of sparseness that motivates the ACCL
algorithm. Let us denote by $|x^i|$ the number of variables that are
on in observation $x^i$, where $|x^i|$ ranges between $0$ and the
total number of variables. Define $s$, the
*data sparseness*, as

$$
s = \max_{i = 1, \dots, N} |x^i|.
$$
If, for example, the data are documents and the variables represent
words from a vocabulary, then $s$ represents the maximum number of
distinct words in a document. The time and memory requirements of
the algorithm that we describe depend on the sparseness $s$; the lower
the sparseness, the more efficient the algorithm. Our algorithm will
realize its largest performance gains when $s \ll n$, where $n$ is the
number of variables.
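The sparseness itself is a one-pass statistic of the data. A minimal sketch, using the same binary-tuple layout as above (our own convention, not the paper's):

```python
def sparseness(data):
    """Data sparseness s: the maximum number of variables that are
    on (equal to 1) in any single observation of a binary dataset."""
    return max(sum(x) for x in data)
```

In the document example, this is the length of the longest document measured in distinct vocabulary words.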
Recall that the CHOWLIU algorithm greedily adds edges to a
graph by choosing the edge that currently has the maximal mutual
information. The algorithm that we describe involves an efficient
way to rank-order mutual information values. There are two key aspects
to the algorithm: (1) how to compare mutual information between
non-co-occurring variables, and (2) how to compute co-occurrences
in a list representation.
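To give a feel for the second ingredient, here is one plausible sketch (an assumption on our part, not the paper's data structure): each observation is stored as the list of its on variables, and co-occurrence counts are accumulated by enumerating pairs within each list. The work is then $O(N s^2)$ rather than $O(N n^2)$, which is where sparse data pays off:

```python
from itertools import combinations
from collections import Counter

def cooccurrences(on_lists):
    """Count co-occurrences N_uv given each observation as a list of the
    indices of its 'on' variables.  Each observation contributes at most
    s*(s-1)/2 pairs, so the total work is O(N * s^2), independent of the
    total number of variables n."""
    Nuv = Counter()
    for on_vars in on_lists:
        # Sort so that each pair is keyed canonically as (u, v) with u < v
        for u, v in combinations(sorted(on_vars), 2):
            Nuv[(u, v)] += 1
    return Nuv
```

Pairs of variables that never co-occur simply never appear in the counter, so no storage is spent on them.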


Journal of Machine Learning Research
2000-10-19