Running time

Next: Memory requirements Up: The ACCL algorithm Previous: The algorithm and its

Running time

The algorithm requires: ${\cal O}(sn)$ for computing $N_v, v\in V$ , ${\cal O}(n\log n)$ for sorting the variables, ${\cal O}(s^2N\log(s^2N/n))$ for step 2, ${\cal O}(n)$ for step 3, ${\cal O}(n_K\log n +N_C)$ for the Kruskal algorithm (

is the number of edges examined by the algorithm, $\log n$ is the time for an extraction from

is an upper bound for the number of elements in the lists $\overline{V}_0$ whose elements need to be skipped occasionally when we extract variables from the virtual lists

), and ${\cal O}(n)$ for creating the the

probability tables in step 5. Summing these terms, we obtain the following upper bound for the running time of the acCL algorithm:

$\begin{displaymath} {\cal O}( n\log n + sn + s^2N \log\frac{s^2N}{n} + n_K\log n). \end{displaymath}$

If we ignore the logarithmic factors this simplifies to

$\begin{displaymath} \tilde{{\cal O}}( sn + s^2N + n_K). \end{displaymath}$

For constant

, this bound is a polynomial of degree 1 in the three variables $n,\;N$ and

. Because

has the range $n-1\;\leq\; n_K \;\leq\; {n(n-1)}/{2}$ , the worst case complexity of the ACCL algorithm is quadratic in

. Empirically, however, as we show below, we find that the dependence of

is generally subquadratic. Moreover, random graph theory implies that if the distribution of the weight values is the same for all edges, then Kruskal's algorithm should take time proportional to $n\log n$ [West 1996]. To verify this latter result, we conducted a set of Monte Carlo experiments, in which we ran the Kruskal algorithm on sets of random weights over domains of dimension up to

. For each

, 1000 runs were performed. Figure 22 plots the average and maximum

versus $n\log n$ for these experiments. The curve for the average displays an essentially linear dependence.

**Figure 22:** The mean (full line), standard deviation and maximum (dotted line) of the number of steps in the Kruskal algorithm over 1000 runs plotted against $n\log n$ . ranges from 5 to 3000. The edge weights were sampled from a uniform distribution.
$\begin{figure} \centerline{\epsfig{file=figures/figkruskal.ps,width=2.4in}} \end{figure}$

Next: Memory requirements Up: The ACCL algorithm Previous: The algorithm and its

Journal of Machine Learning Research 2000-10-19