Book contents
- Frontmatter
- Contents
- Preface
- 1 Introduction
- 2 Propositional Logic
- 3 Probability Calculus
- 4 Bayesian Networks
- 5 Building Bayesian Networks
- 6 Inference by Variable Elimination
- 7 Inference by Factor Elimination
- 8 Inference by Conditioning
- 9 Models for Graph Decomposition
- 10 Most Likely Instantiations
- 11 The Complexity of Probabilistic Inference
- 12 Compiling Bayesian Networks
- 13 Inference with Local Structure
- 14 Approximate Inference by Belief Propagation
- 15 Approximate Inference by Stochastic Sampling
- 16 Sensitivity Analysis
- 17 Learning: The Maximum Likelihood Approach
- 18 Learning: The Bayesian Approach
- A Notation
- B Concepts from Information Theory
- C Fixed Point Iterative Methods
- D Constrained Optimization
- Bibliography
- Index
17 - Learning: The Maximum Likelihood Approach
Published online by Cambridge University Press: 23 February 2011
Summary
This chapter discusses the process of learning Bayesian networks from data. The learning process is studied under different conditions, which relate to the nature of the available data and the amount of prior knowledge we have about the Bayesian network.
Introduction
Consider Figure 17.1, which depicts a Bayesian network structure from the domain of medical diagnosis (we treated this network in Chapter 5). Consider also the data set depicted in this figure. Each row in this data set is called a case and represents a medical record for a particular patient. Note that some of the cases are incomplete, where “?” indicates the unavailability of corresponding data for that patient. The data set is therefore said to be incomplete due to these missing values; otherwise, it is called a complete data set.
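The distinction between complete and incomplete data sets can be made concrete with a minimal sketch. The records below are hypothetical and are not the data set of Figure 17.1; the variable names (`Cold`, `Fever`, `SoreThroat`) and the helper functions are assumptions introduced only for illustration.

```python
# Marker for an unavailable value, following the "?" convention in the text.
MISSING = "?"

# Hypothetical toy records (NOT the data set from Figure 17.1).
# Each dictionary is one case, i.e. one patient's record.
cases = [
    {"Cold": "true", "Fever": "true", "SoreThroat": "false"},
    {"Cold": "false", "Fever": MISSING, "SoreThroat": "false"},
    {"Cold": MISSING, "Fever": "false", "SoreThroat": "true"},
]


def is_complete_case(case):
    """A case is complete when none of its values is missing."""
    return all(value != MISSING for value in case.values())


def is_complete_data_set(data):
    """A data set is complete when every one of its cases is complete."""
    return all(is_complete_case(case) for case in data)


# Keep only the cases with no missing values.
complete_cases = [case for case in cases if is_complete_case(case)]
```

Here `is_complete_data_set(cases)` is false because two records contain `"?"`, while the filtered list `complete_cases` is a complete data set by construction.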
A key objective of this chapter is to provide techniques for estimating the parameters of a network structure given both complete and incomplete data sets. The techniques we provide therefore complement those given in Chapter 5 for constructing Bayesian networks. In particular, we can now construct the network structure either from design information or by working with domain experts, as discussed in Chapter 5, and then use the techniques discussed in this chapter to estimate the CPTs of these structures from data. We also discuss techniques for learning the network structure itself, although our focus here is on complete data sets for reasons that we state later.
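As a rough illustration of estimating a CPT from a complete data set, the sketch below uses relative frequencies: the estimate for a child value x given a parent instantiation u is the number of cases containing both x and u divided by the number of cases containing u. For complete data this empirical frequency is the maximum likelihood estimate. The function name `estimate_cpt`, the variables `Cold` and `Fever`, and the four records are all hypothetical.

```python
from collections import Counter


def estimate_cpt(data, child, parents):
    """Estimate P(child | parents) by relative frequency from complete cases.

    Returns a dict mapping (parent_values, child_value) to the estimate.
    Combinations never observed in the data are simply absent from the
    result; parent instantiations that never occur have no estimate.
    """
    joint_counts = Counter()   # counts of (parent values, child value)
    parent_counts = Counter()  # counts of parent values alone
    for case in data:
        u = tuple(case[p] for p in parents)
        joint_counts[(u, case[child])] += 1
        parent_counts[u] += 1
    return {
        (u, x): count / parent_counts[u]
        for (u, x), count in joint_counts.items()
    }


# Hypothetical complete data set (illustrative only).
data = [
    {"Cold": True, "Fever": True},
    {"Cold": True, "Fever": True},
    {"Cold": True, "Fever": False},
    {"Cold": False, "Fever": False},
]

cpt = estimate_cpt(data, child="Fever", parents=["Cold"])
```

With these four records, the estimate of `Fever=True` given `Cold=True` is 2/3, since two of the three cases with `Cold=True` also have `Fever=True`. The sketch assumes every case is complete; it does not handle missing values.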
- Type: Chapter
- Information: Modeling and Reasoning with Bayesian Networks, pp. 439-476
- Publisher: Cambridge University Press
- Print publication year: 2009