
Bibliography

Bishop 1999
Bishop, C. M.
1999.
Latent variable models.
In M. I. Jordan (Ed.), Learning in Graphical Models.
Cambridge, MA: MIT Press.

Blake & Merz 1998
Blake, C. & Merz, C.
1998.
UCI Repository of Machine Learning Databases.
http://www.ics.uci.edu/~mlearn/MLRepository.html.

Boutilier, Friedman, Goldszmidt & Koller 1996
Boutilier, C., Friedman, N., Goldszmidt, M. & Koller, D.
1996.
Context-specific independence in Bayesian networks.
In Proceedings of the 12th Conference on Uncertainty in AI (pp. 64-72).
Morgan Kaufmann.

Buntine 1996
Buntine, W.
1996.
A guide to the literature on learning graphical models.
IEEE Transactions on Knowledge and Data Engineering, 8, 195-210.

Cheeseman & Stutz 1995
Cheeseman, P. & Stutz, J.
1995.
Bayesian classification (AutoClass): Theory and results.
In U. Fayyad, G. Piatetsky-Shapiro, P. Smyth & R. Uthurusamy (Eds.), Advances in Knowledge Discovery and Data Mining (pp. 153-180).
AAAI Press.

Cheng, Bell & Liu 1997
Cheng, J., Bell, D. A. & Liu, W.
1997.
Learning belief networks from data: an information theory based approach.
In Proceedings of the Sixth ACM International Conference on Information and Knowledge Management.

Chow & Liu 1968
Chow, C. K. & Liu, C. N.
1968.
Approximating discrete probability distributions with dependence trees.
IEEE Transactions on Information Theory, IT-14(3), 462-467.

Cooper & Herskovits 1992
Cooper, G. F. & Herskovits, E.
1992.
A Bayesian method for the induction of probabilistic networks from data.
Machine Learning, 9, 309-347.

Cormen, Leiserson & Rivest 1990
Cormen, T. H., Leiserson, C. E. & Rivest, R. L.
1990.
Introduction to Algorithms.
Cambridge, MA: MIT Press.

Cowell, Dawid, Lauritzen & Spiegelhalter 1999
Cowell, R. G., Dawid, A. P., Lauritzen, S. L. & Spiegelhalter, D. J.
1999.
Probabilistic Networks and Expert Systems.
New York, NY: Springer.

Dayan & Zemel 1995
Dayan, P. & Zemel, R. S.
1995.
Competition and multiple cause models.
Neural Computation, 7(3), 565-579.

Dempster, Laird & Rubin 1977
Dempster, A. P., Laird, N. M. & Rubin, D. B.
1977.
Maximum likelihood from incomplete data via the EM algorithm.
Journal of the Royal Statistical Society, B, 39, 1-38.

Fredman & Tarjan 1987
Fredman, M. L. & Tarjan, R. E.
1987.
Fibonacci heaps and their uses in improved network optimization algorithms.
Journal of the Association for Computing Machinery, 34(3), 596-615.

Frey, Hinton & Dayan 1996
Frey, B. J., Hinton, G. E. & Dayan, P.
1996.
Does the wake-sleep algorithm produce good density estimators?
In D. Touretzky, M. Mozer & M. Hasselmo (Eds.), Neural Information Processing Systems (pp. 661-667).
Cambridge, MA: MIT Press.

Friedman 1998
Friedman, N.
1998.
The Bayesian structural EM algorithm.
In Proceedings of the 14th Conference on Uncertainty in AI (pp. 129-138).
San Francisco, CA: Morgan Kaufmann.

Friedman, Geiger & Goldszmidt 1997
Friedman, N., Geiger, D. & Goldszmidt, M.
1997.
Bayesian network classifiers.
Machine Learning, 29, 131-163.

Friedman & Getoor 1999
Friedman, N. & Getoor, L.
1999.
Efficient learning using constrained sufficient statistics.
In Proceedings of the 7th International Workshop on Artificial Intelligence and Statistics (AISTATS-99).

Friedman, Getoor, Koller & Pfeffer 1996
Friedman, N., Getoor, L., Koller, D. & Pfeffer, A.
1996.
Learning probabilistic relational models.
In Proceedings of the 16th International Joint Conference on Artificial Intelligence (IJCAI) (pp. 1300-1307).

Friedman, Goldszmidt & Lee 1998
Friedman, N., Goldszmidt, M. & Lee, T.
1998.
Bayesian network classification with continuous attributes: Getting the best of both discretization and parametric fitting.
In Proceedings of the International Conference on Machine Learning (ICML).

Geiger 1992
Geiger, D.
1992.
An entropy-based learning algorithm of Bayesian conditional trees.
In Proceedings of the 8th Conference on Uncertainty in AI (pp. 92-97).
Morgan Kaufmann Publishers.

Geiger & Heckerman 1996
Geiger, D. & Heckerman, D.
1996.
Knowledge representation and inference in similarity networks and Bayesian multinets.
Artificial Intelligence, 82, 45-74.

Hastie & Tibshirani 1996
Hastie, T. & Tibshirani, R.
1996.
Discriminant analysis by mixture modeling.
Journal of the Royal Statistical Society B, 58, 155-176.

Heckerman, Geiger & Chickering 1995
Heckerman, D., Geiger, D. & Chickering, D. M.
1995.
Learning Bayesian networks: the combination of knowledge and statistical data.
Machine Learning, 20(3), 197-243.

Hinton, Dayan, Frey & Neal 1995
Hinton, G. E., Dayan, P., Frey, B. & Neal, R. M.
1995.
The wake-sleep algorithm for unsupervised neural networks.
Science, 268, 1158-1161.

Jelinek 1997
Jelinek, F.
1997.
Statistical Methods for Speech Recognition.
Cambridge, MA: MIT Press.

Jordan & Jacobs 1994
Jordan, M. I. & Jacobs, R. A.
1994.
Hierarchical mixtures of experts and the EM algorithm.
Neural Computation, 6, 181-214.

Kontkanen, Myllymaki & Tirri 1996
Kontkanen, P., Myllymaki, P. & Tirri, H.
1996.
Constructing Bayesian finite mixture models by the EM algorithm (Tech. Rep. C-1996-9).
University of Helsinki, Department of Computer Science.

Lauritzen 1995
Lauritzen, S. L.
1995.
The EM algorithm for graphical association models with missing data.
Computational Statistics and Data Analysis, 19, 191-201.

Lauritzen 1996
Lauritzen, S. L.
1996.
Graphical Models.
Oxford: Clarendon Press.

Lauritzen, Dawid, Larsen & Leimer 1990
Lauritzen, S. L., Dawid, A. P., Larsen, B. N. & Leimer, H.-G.
1990.
Independence properties of directed Markov fields.
Networks, 20, 579-605.

McLachlan & Basford 1988
McLachlan, G. J. & Basford, K. E.
1988.
Mixture Models: Inference and Applications to Clustering.
NY: Marcel Dekker.

Meila & Jaakkola 2000
Meila, M. & Jaakkola, T.
2000.
Tractable Bayesian learning of tree distributions.
In C. Boutilier & M. Goldszmidt (Eds.), Proceedings of the 16th Conference on Uncertainty in AI (pp. 380-388).
San Francisco, CA: Morgan Kaufmann.

Meila & Jordan 1998
Meila, M. & Jordan, M. I.
1998.
Estimating dependency structure as a hidden variable.
In M. I. Jordan, M. J. Kearns & S. A. Solla (Eds.), Neural Information Processing Systems (pp. 584-590).
MIT Press.

Meila-Predoviciu 1999
Meila-Predoviciu, M.
1999.
Learning with mixtures of trees.
Ph.D. thesis, Massachusetts Institute of Technology.

Michie, Spiegelhalter & Taylor 1994
Michie, D., Spiegelhalter, D. J. & Taylor, C. C.
1994.
Machine Learning, Neural and Statistical Classification.
New York: Ellis Horwood.

Monti & Cooper 1998
Monti, S. & Cooper, G. F.
1998.
A Bayesian network classifier that combines a finite mixture model and a naive Bayes model (Tech. Rep. ISSP-98-01).
University of Pittsburgh.

Moore & Lee 1998
Moore, A. W. & Lee, M. S.
1998.
Cached sufficient statistics for efficient machine learning with large datasets.
Journal of Artificial Intelligence Research, 8, 67-91.

Neal & Hinton 1999
Neal, R. M. & Hinton, G. E.
1999.
A view of the EM algorithm that justifies incremental, sparse, and other variants.
In M. I. Jordan (Ed.), Learning in Graphical Models (pp. 355-368).
Cambridge, MA: MIT Press.

Ney, Essen & Kneser 1994
Ney, H., Essen, U. & Kneser, R.
1994.
On structuring probabilistic dependences in stochastic language modelling.
Computer Speech and Language, 8, 1-38.

Noordewier, Towell & Shavlik 1991
Noordewier, M. O., Towell, G. G. & Shavlik, J. W.
1991.
Training knowledge-based neural networks to recognize genes in DNA sequences.
In R. P. Lippmann, J. E. Moody & D. S. Touretzky (Eds.), Advances in Neural Information Processing Systems (pp. 530-538).
Morgan Kaufmann Publishers.

Pearl 1988
Pearl, J.
1988.
Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference.
San Mateo, CA: Morgan Kaufmann Publishers.

Phillips, Moon, Rauss & Rizvi 1997
Phillips, P., Moon, H., Rauss, P. & Rizvi, S.
1997.
The FERET evaluation methodology for face-recognition algorithms.
In Proceedings of the 1997 Conference on Computer Vision and Pattern Recognition.
San Juan, Puerto Rico.

Rasmussen 1996
Rasmussen, C. E., Neal, R. M., Hinton, G. E., Camp, D. van, Revow, M., Ghahramani, Z., Kustra, R. & Tibshirani, R.
1996.
The DELVE Manual.
http://www.cs.utoronto.ca/~delve.

Rissanen 1989
Rissanen, J.
1989.
Stochastic Complexity in Statistical Inquiry.
New Jersey: World Scientific Publishing Company.

Rubin & Thayer 1983
Rubin, D. B. & Thayer, D. T.
1983.
EM algorithms for ML factor analysis.
Psychometrika, 47, 69-76.

Saul & Jordan 1999
Saul, L. K. & Jordan, M. I.
1999.
A mean field learning algorithm for unsupervised neural networks.
In M. I. Jordan (Ed.), Learning in Graphical Models (pp. 541-554).
Cambridge, MA: MIT Press.

Shafer & Shenoy 1990
Shafer, G. & Shenoy, P.
1990.
Probability propagation.
Annals of Mathematics and Artificial Intelligence, 2, 327-352.

Smyth, Heckerman & Jordan 1997
Smyth, P., Heckerman, D. & Jordan, M. I.
1997.
Probabilistic independence networks for hidden Markov probability models.
Neural Computation, 9, 227-270.

Thiesson, Meek, Chickering & Heckerman 1997
Thiesson, B., Meek, C., Chickering, D. M. & Heckerman, D.
1997.
Learning mixtures of Bayes networks (Tech. Rep. MSR-POR-97-30).
Microsoft Research.

Watson, Hopkins, Roberts, Steitz & Weiner 1987
Watson, J. D., Hopkins, N. H., Roberts, J. W., Steitz, J. A. & Weiner, A. M.
1987.
Molecular Biology of the Gene (Vol. I, 4th ed.).
Menlo Park, CA: The Benjamin/Cummings Publishing Company.

West 1996
West, D. B.
1996.
Introduction to Graph Theory.
Upper Saddle River, NJ: Prentice Hall.


Journal of Machine Learning Research 2000-10-19