## On Inclusion-Driven Learning of Bayesian Networks

**
Robert Castelo, Tomás Kocka**; 4(Sep):527-574, 2003.

### Abstract

Two or more Bayesian network structures are Markov equivalent when the corresponding acyclic digraphs encode the same set of conditional independencies. Therefore, the search space of Bayesian network structures may be organized in equivalence classes, where each of them represents a different set of conditional independencies. The collection of sets of conditional independencies obeys a partial order, the so-called "inclusion order."

This paper discusses in depth the role that the inclusion order plays in
learning the structure of Bayesian networks. In particular, this role involves
the way a learning algorithm traverses the search space. We introduce a
condition for traversal operators, the *inclusion boundary condition*,
which, when it is satisfied, guarantees that the search strategy can avoid
local maxima. This is proved under the assumptions that the data is sampled
from a probability distribution which is *faithful* to an acyclic digraph,
and the length of the sample is unbounded.

The previous discussion leads to the design of a new traversal operator and two new learning algorithms in the context of heuristic search and the Markov Chain Monte Carlo method. We carry out a set of experiments with synthetic and real-world data that show empirically the benefit of striving for the inclusion order when learning Bayesian networks from data.