
Presorting mutual information values

Our goal is to presort the mutual information values $I_{uv}$ for all variables $v\prec u$ that do not co-occur with $u$. The following theorem shows that this can be done exactly as before.

Theorem   Let $u,v,w$ be discrete variables such that $v$ and $w$ do not co-occur with $u$ (i.e. $u\neq0\;\Rightarrow \;v=w=0$) in a given dataset ${\cal D}$. Let $N_{v0}$ and $N_{w0}$ be the numbers of datapoints for which $v=0$ and $w=0$ respectively, and let $I_{uv}$ and $I_{uw}$ be the corresponding empirical mutual information values based on the sample ${\cal D}$. Then

\begin{displaymath}
N_{v0} \;>\; N_{w0}\;\;\Rightarrow\;\;I_{uv} \;\leq\;I_{uw}
\end{displaymath}

with equality only if $u$ is identically 0.

The proof of the theorem is given in the Appendix. The implication of this theorem is that the ACCL algorithm extends to variables taking more than two values with only one minor modification: the scalar counts $N_{v}$ and $N_{uv}$ are replaced by the vectors $N_v^j,\; j\neq 0$ and the contingency tables $N_{uv}^{ij},\; i,j\neq 0$, respectively.
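The practical consequence of the theorem is that, among variables not co-occurring with $u$, sorting by the zero-counts $N_{v0}$ in decreasing order sorts the mutual informations $I_{uv}$ in increasing order, without ever computing them. The following sketch (a toy dataset and an illustrative helper empirical_mi, neither taken from the paper) demonstrates the ordering on a small example satisfying the premise $u\neq0\Rightarrow v=w=0$:

```python
import math
from collections import Counter

def empirical_mi(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    # sum over observed cells of p(x,y) * log( p(x,y) / (p(x) p(y)) )
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

# Toy dataset: whenever u != 0, both v and w are 0 (the non-co-occurrence premise).
u = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
v = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0]   # N_v0 = 7
w = [0, 0, 0, 0, 2, 1, 0, 0, 0, 0, 0, 0]   # N_w0 = 10

I_uv = empirical_mi(u, v)
I_uw = empirical_mi(u, w)
# N_w0 > N_v0, so the theorem predicts I_uw <= I_uv.
assert I_uw <= I_uv
```

Intuitively, $w$ is nonzero less often than $v$, so observing $w$ reveals $u=0$ less frequently and carries less information about $u$; the zero-counts alone capture this ordering.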

Figure 23: Running time for the ACCL (full line) and CHOWLIU (dotted line) algorithms versus number of vertices $n$ for different values of the sparseness $s$.
\begin{figure}
\centerline{\epsfig{file=figures/acc-time.ps,width=3.1in}}
\end{figure}

Figure 24: Number of steps of the Kruskal algorithm $n_K$ versus domain size $n$ measured for the ACCL algorithm for different values of $s$.
\begin{figure}
\centerline{\epsfig{file=figures/acc-nk.ps,width=3.1in}}
\end{figure}


Journal of Machine Learning Research 2000-10-19