Comparing mutual information between non-co-occurring variables

Next: Computing co-occurrences in a Up: The ACCL algorithm Previous: The ACCL algorithm

Comparing mutual information between non-co-occurring variables

Let us focus on the pairs

that do not co-occur, i.e., for which $N_{uv}=0$ . For such a pair, the mutual information $I_{uv}$ is a function of

and

. Let us analyze the variation of the mutual information with respect to $N_{v}$ by taking the corresponding partial derivative:

$\begin{displaymath} \frac{\partial I_{uv}}{\partial N_v}\; = \; \log \frac{N-N_v}{N-N_u-N_v} \;>\;0 \end{displaymath}$

(11)

This result implies that for a given variable

and any two variables

for which $N_{uv} =N_{uv'}=0$ we have:

$\begin{displaymath} N_v>N_{v'}\; {\rm implies}\;{\rm that} \; I_{uv}>I_{uv'}. \end{displaymath}$

This observation allows us to partially sort the mutual information values $I_{uv}$ for non-co-occurring pairs

, without computing them. First, we have to sort all the variables by their number of occurrences

. We store the result in a list

. This gives a total ordering `` $\succ$ '' for the variables in

$\begin{displaymath} v\;\succ \;u\;\;\Leftrightarrow\;\; v\; {\rm preceeds}\; u ... ...rm in}\; {\rm list}\; L \;\; \Rightarrow\;\; N_v\;\geq\;N_u . \end{displaymath}$

For each

, we define the list of variables following

in the ordering `` $\succ$ '' and not co-occurring with it:

$\begin{displaymath} V_0(u)\;=\;\{v\in V,\;v\prec u,\; N_{uv}=0\} . \end{displaymath}$

This list is sorted by decreasing

and therefore, implicitly, by decreasing $I_{uv}$ . Since the data are sparse, most pairs of variables do not co-occur. Therefore, by creating the lists

, a large number of values of the mutual information are partially sorted. Before showing how to use this construction, let us examine an efficient way of computing the $N_{uv}$ counts when the data are sparse.

**Figure 20:** The data structure that supplies the next candidate edge. Vertically on the left are the variables, sorted by decreasing . For a given , there are two lists: , sorted by decreasing $I_{uv}$ and (the virtual list) , sorted by decreasing . The maximum of the two first elements of these lists is inserted into an Fibonacci heap. The overall maximum of $I_{uv}$ can then be extracted as the maximum of the Fibonacci heap.
$\begin{figure} \centerline{\epsfig{file=figures/black-box.eps,width=5in}} \end{figure}$

Next: Computing co-occurrences in a Up: The ACCL algorithm Previous: The ACCL algorithm

Journal of Machine Learning Research 2000-10-19