
Learning Graphical Models

Reference work entry, Encyclopedia of Machine Learning

Synonyms

Bayesian model averaging; Causal discovery; Dynamic Bayesian network; Learning Bayesian networks

Definition

Learning graphical models (see Graphical Models) means learning a graphical representation of either a causal or a probabilistic model containing the variables X_j ∈ {X_i}. Although graphical models include more than directed acyclic graphs (DAGs), the focus here is on learning DAGs, as that is where the majority of research and application has taken place.

Definition 1 (Directed acyclic graph (DAG)).

A directed acyclic graph (DAG) is a set of variables (nodes, vertices) {X_i} and a set of directed arcs (edges) between them such that following the arcs in their given direction can never lead from a variable back to itself.
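As a concrete illustration of this definition (not part of the original entry), the following Python sketch represents a DAG as a mapping from each variable to its children and checks the acyclicity condition by depth-first search; the variable names are invented for the example.

    from typing import Dict, List

    def is_acyclic(arcs: Dict[str, List[str]]) -> bool:
        """Return True iff following the arcs in their given direction
        can never lead from a variable back to itself (Definition 1)."""
        WHITE, GREY, BLACK = 0, 1, 2          # unvisited / on current path / finished
        colour = {v: WHITE for v in arcs}

        def visit(v: str) -> bool:
            colour[v] = GREY
            for child in arcs.get(v, []):
                state = colour.get(child, WHITE)
                if state == GREY:             # back arc: a directed cycle exists
                    return False
                if state == WHITE and not visit(child):
                    return False
            colour[v] = BLACK
            return True

        return all(visit(v) for v in list(arcs) if colour[v] == WHITE)

For instance, is_acyclic({"Smoking": ["Cancer"], "Cancer": ["XRay"], "XRay": []}) returns True, whereas adding an arc from XRay back to Smoking would close a directed cycle and make it return False.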

DAGs parameterized to represent probability distributions are otherwise known as Bayesian networks. Some necessary concepts and notation for discussing the learning of graphical models are given in Table 1.

Learning Graphical Models. Table 1...
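To make the connection between a parameterized DAG and a probability distribution concrete, the following sketch (again illustrative only; the two variables and the probabilities are invented rather than drawn from Table 1) attaches conditional probability tables to the one-arc DAG Smoking → Cancer and recovers probabilities from the factorization P(X_1, ..., X_n) = Π_i P(X_i | parents(X_i)).

    # Illustrative only: a two-variable Bayesian network Smoking -> Cancer
    # with invented conditional probability tables (CPTs).
    p_smoking = {True: 0.3, False: 0.7}            # P(Smoking)
    p_cancer_given_smoking = {                     # P(Cancer | Smoking)
        True:  {True: 0.05, False: 0.95},
        False: {True: 0.01, False: 0.99},
    }

    def joint(smoking: bool, cancer: bool) -> float:
        """P(Smoking, Cancer) = P(Smoking) * P(Cancer | Smoking)."""
        return p_smoking[smoking] * p_cancer_given_smoking[smoking][cancer]

    # Marginalize out Smoking to get P(Cancer = True):
    p_cancer = sum(joint(s, True) for s in (True, False))   # 0.3*0.05 + 0.7*0.01 = 0.022

Learning such a model from data amounts to choosing both the arcs (the structure) and the CPT entries (the parameters), which is the task addressed by the readings below.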


Recommended Reading

  • The PC algorithm and variants were initially documented in Spirtes et al. (1993); their second edition (Spirtes, Glymour, & Scheines, 2000) covers more ground. Their TETRAD IV program is available from their web site http://www.phil.cmu.edu/projects/tetrad/. PC is contained within TETRAD IV, and an implementation is also available with the Weka machine learning platform at http://www.cs.waikato.ac.nz/ml/weka/.

  • A well-known tutorial by David Heckerman (1999), reprinted without change in Heckerman (2008), is well worth looking at for background on causal discovery and on parameterizing Bayesian networks. A more current review of many of the topics introduced here is to be found in a forthcoming article (Daly, Shen, & Aitken, forthcoming). For other good treatments of parameterization see Cowell, Dawid, Lauritzen, and Spiegelhalter (1999) or Neapolitan (2003).

  • There are a number of useful anthologies in the area of learning graphical models. Learning in Graphical Models (Jordan, 1999) is one of the best, including Heckerman’s tutorial and a variety of excellent reviews of causal discovery methods, such as Markov Chain Monte Carlo search techniques.

  • Textbooks treating the learning of Bayesian networks include Borgelt and Kruse (2002), Neapolitan (2003), Korb and Nicholson (2004), and Koller and Friedman (2009).

  • Aliferis, C. F., Statnikov, A., Tsamardinos, I., Mani, S., & Koutsoukos, X. D. (2010a). Local causal and Markov blanket induction for causal discovery and feature selection for classification. Part I: Algorithms and empirical evaluation. Journal of Machine Learning Research, 11, 171–234.

  • Aliferis, C. F., Statnikov, A., Tsamardinos, I., Mani, S., & Koutsoukos, X. D. (2010b). Local causal and Markov blanket induction for causal discovery and feature selection for classification. Part II: Analysis and extensions. Journal of Machine Learning Research, 11, 235–284.

  • Borgelt, C., & Kruse, R. (2002). Graphical models: Methods for data analysis and mining. New York: Wiley.

  • Bouckaert, R. (1993). Probabilistic network construction using the minimum description length principle. Lecture Notes in Computer Science, 747, 41–48.

  • Chickering, D. M. (1995). A transformational characterization of equivalent Bayesian network structures. In P. Besnard & S. Hanks (Eds.), Proceedings of the 11th conference on uncertainty in artificial intelligence, San Francisco (pp. 87–98).

  • Chickering, D. M., & Heckerman, D. (1997). Efficient approximations for the marginal likelihood of Bayesian networks with hidden variables. Machine Learning, 29, 181–212.

  • Chickering, D. M., Heckerman, D., & Meek, C. (2004). Large-sample learning of Bayesian networks is NP-hard. Journal of Machine Learning Research, 5, 1287–1330.

  • Chickering, D. M., & Meek, C. (2002). Finding optimal Bayesian networks. In Proceedings of the 18th annual conference on uncertainty in AI, San Francisco (pp. 94–102).

  • Chow, C., & Liu, C. (1968). Approximating discrete probability distributions with dependence trees. IEEE Transactions on Information Theory, 14, 462–467.

  • Cowell, R. G., Dawid, A. P., Lauritzen, S. L., & Spiegelhalter, D. J. (1999). Probabilistic networks and expert systems. New York: Springer.

  • Cruz-Ramírez, N., Acosta-Mesa, H. G., Barrientos-Martínez, R. E., & Nava-Fernández, L. A. (2006). How good are the Bayesian information criterion and the Minimum Description Length principle for model selection? A Bayesian network analysis. Lecture Notes in Computer Science, 4293, 494–504.

  • Daly, R., Shen, Q., & Aitken, S. (forthcoming). Learning Bayesian networks: Approaches and issues. The Knowledge Engineering Review.

  • Dash, D., & Cooper, G. F. (2004). Model averaging for prediction with discrete Bayesian networks. Journal of Machine Learning Research, 5, 1177–1203.

  • Guyon, I., Aliferis, C., Cooper, G., Elisseeff, A., Pellet, J.-P., Spirtes, P., et al. (Eds.) (2008). JMLR workshop and conference proceedings: Causation and prediction challenge (WCCI 2008), volume 3. Journal of Machine Learning Research.

  • Heckerman, D. (1999). A tutorial on learning with Bayesian networks. In M. Jordan, (Ed.), Learning in graphical models (pp. 301–354). Cambridge: MIT.

  • Heckerman, D. (2008). A tutorial on learning with Bayesian networks. In Innovations in Bayesian networks (pp. 33–82). Berlin: Springer Verlag.

  • Heckerman, D., Geiger, D., & Chickering, D. M. (1994). Learning Bayesian networks: The combination of knowledge and statistical data. In R. López de Mántaras & D. Poole (Eds.), Proceedings of the tenth conference on uncertainty in artificial intelligence, San Francisco (pp. 293–301).

  • Jordan, M. I. (1999). Learning in graphical models. Cambridge, MA: MIT.

  • Koller, D., & Friedman, N. (2009). Probabilistic graphical models: Principles and techniques. Cambridge, MA: MIT.

  • Korb, K. B., & Nicholson, A. E. (2004). Bayesian artificial intelligence. Boca Raton, FL: CRC.

  • Larrañaga, P., Kuijpers, C. M. H., Murga, R. H., & Yurramendi, Y. (1996). Learning Bayesian network structures by searching for the best ordering with genetic algorithms. IEEE Transactions on Systems, Man and Cybernetics, Part A, 26, 487–493.

  • Margaritis, D., & Thrun, S. (2000). Bayesian network induction via local neighborhoods. In S. A. Solla, T. K. Leen, & K. R. Müller (Eds.), Advances in neural information processing systems (Vol. 12, pp. 505–511). Cambridge: MIT.

  • Nägele, A., Dejori, M., & Stetter, M. (2007). Bayesian substructure learning – Approximate learning of very large network structures. In Proceedings of the 18th European conference on machine learning; Lecture notes in AI (Vol. 4701, pp. 238–249). Warsaw, Poland.

  • Neapolitan, R. E. (2003). Learning Bayesian networks. Upper Saddle River, NJ: Prentice Hall.

  • Neil, J. R., & Korb, K. B. (1999). The evolution of causal models. In N. Zhong & L. Zhou (Eds.), Third Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD-99), Beijing, China (pp. 432–437). New York: Springer.

  • O’Donnell, R., Nicholson, A., Han, B., Korb, K., Alam, M., & Hope, L. (2006). Causal discovery with prior information. In Australasian joint conference on artificial intelligence, (pp. 1162–1167). New York: Springer.

  • Spirtes, P., Glymour, C., & Scheines, R. (1993). Causation, prediction and search. In Lecture notes in statistics (Vol. 81). New York: Springer.

  • Spirtes, P., Glymour, C., & Scheines, R. (2000). Causation, prediction and search. (2nd ed.). Cambridge: MIT.

  • Suzuki, J. (1996). Learning Bayesian belief networks based on the minimum description length principle. In L. Saitta (Ed.), Proceedings of the 13th international conference on machine learning, San Francisco (pp. 462–470). Morgan Kaufmann.

  • Suzuki, J. (1999). Learning Bayesian belief networks based on the MDL principle: An efficient algorithm using the branch and bound technique. IEICE Transactions on Information and Systems, 82, 356–367.

  • Tsamardinos, I., Brown, L. E., & Aliferis, C. F. (2006). The max-min hill-climbing Bayesian network structure learning algorithm. Machine Learning, 65(1), 31–78.

  • Verma, T. S., & Pearl, J. (1990). Equivalence and synthesis of causal models. In Proceedings of the sixth conference on uncertainty in AI, Boston (pp. 220–227). San Mateo, CA: Morgan Kaufmann.

  • Wallace, C. S., Korb, K. B., & Dai, H. (1996). Causal discovery via MML. In L. Saitta (Ed.), Proceedings of the 13th international conference on machine learning, San Francisco (pp. 516–524). Morgan Kaufmann.

  • Wong, M. L., Lam, W., & Leung, K. S. (1999). Using evolutionary programming and minimum description length principle for data mining of Bayesian networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(2), 174–178.

Copyright information

© 2011 Springer Science+Business Media, LLC

Cite this entry

Korb, K.B. (2011). Learning Graphical Models. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-30164-8_460
