
# Introduction

The Bayesian network has proven to be a valuable tool for encoding,
learning, and reasoning about probabilistic relationships. In this
paper, we introduce another graphical representation of such
relationships called a *dependency network*. The representation
can be thought of as a collection of regressions or classifications
among variables in a domain that can be combined using the machinery
of Gibbs sampling to define a joint distribution for that domain. The
dependency network has several advantages and disadvantages with
respect to the Bayesian network. For example, a dependency network is
not useful for encoding causal relationships and is difficult to
construct using a knowledge-based approach. Nonetheless, there are
straightforward and computationally efficient algorithms for learning
both the structure and probabilities of a dependency network from
data; and the learned model is quite useful for encoding and
displaying predictive (i.e., dependence and independence)
relationships. In addition, dependency networks are well suited to
the task of predicting preferences--a task often referred to as
*collaborative filtering*--and are generally useful for *probabilistic
inference*, the task of answering probabilistic queries.

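To make the central idea concrete, here is a minimal sketch of how a collection of local conditional distributions, one per variable, can be combined with Gibbs sampling to define and query a joint distribution. The toy domain, the hand-coded conditionals (standing in for learned classifiers or regressions), and all function names are illustrative assumptions, not part of the original text:

```python
import random

# Hypothetical toy domain: three binary variables X0, X1, X2.
# A dependency network stores one local conditional distribution
# P(X_i | all other variables) per variable. Here these are
# hand-coded stand-ins for learned classifiers/regressions.

def p_x0(x):  # P(X0 = 1 | X1, X2)
    return 0.9 if x[1] == 1 else 0.2

def p_x1(x):  # P(X1 = 1 | X0, X2)
    return 0.8 if x[2] == 1 else 0.3

def p_x2(x):  # P(X2 = 1 | X0, X1)
    return 0.7 if x[0] == 1 else 0.4

local_conditionals = [p_x0, p_x1, p_x2]

def gibbs_sample(n_samples, burn_in=100, seed=0):
    """Draw samples from the joint distribution that the local
    conditionals define via Gibbs sampling: repeatedly resample
    each variable from its conditional given the current values
    of the others."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(3)]
    samples = []
    for sweep in range(burn_in + n_samples):
        for i, p_i in enumerate(local_conditionals):
            x[i] = 1 if rng.random() < p_i(x) else 0
        if sweep >= burn_in:
            samples.append(tuple(x))
    return samples

samples = gibbs_sample(5000)
# Probabilistic queries are then answered from the samples,
# e.g. estimating the marginal P(X0 = 1):
p_x0_marginal = sum(s[0] for s in samples) / len(samples)
```

The same sampling loop answers conditional queries as well, by clamping evidence variables to observed values and resampling only the rest.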
In Section 2, we motivate dependency networks from the
perspective of data visualization and introduce a special case of the
graphical representation called a consistent dependency network. We
show, roughly speaking, that such a network is equivalent to a Markov
network, and describe how Gibbs sampling is used to answer
probabilistic queries given a consistent dependency network. In
Section 3, we introduce the dependency network in its
general form and describe an algorithm for learning its structure and
probabilities from data. Essentially, the algorithm consists of
independently performing a probabilistic classification or regression
for each variable in the domain. We then show how procedures closely
resembling Gibbs sampling can be applied to the dependency network to
define a joint distribution for the domain and to answer probabilistic
queries. In addition, we provide experimental results on real data
that illustrate the utility of this approach, and discuss related
work. In Section 4, we describe the task of collaborative
filtering and present an empirical study showing that dependency
networks are almost as accurate as and computationally more attractive
than Bayesian networks on this task. Finally, in
Section 5, we describe a data visualization tool based on
dependency networks.

Journal of Machine Learning Research,
2000-10-19