Markov chains and hidden Markov models

Mark Borodovsky; Svetlana Ekisheva

doi:10.1017/CBO9780511617829.004

3 - Markov chains and hidden Markov models

Published online by Cambridge University Press: 06 January 2010

Mark Borodovsky: Georgia Institute of Technology
Svetlana Ekisheva: Georgia Institute of Technology

Summary

The chapter of Biological Sequence Analysis (BSA) that introduces Markov chains and hidden Markov models plays a critical role in that book. The sequence comparison algorithms described in Chapter 2 could not have been developed without theoretically justified similarity scores and a statistical theory of similarity score distributions. These developments, in turn, would not be feasible without rational choices of probabilistic models for DNA and protein sequences. Both Markov chains and hidden Markov models are often remarkably good candidates for such sequence models. Moreover, hidden Markov models (HMMs) are potentially a more flexible tool for biological sequence analysis because they model observable and non-observable (hidden) states simultaneously. Having two types of states fits the need to model important information that exists beyond the sequences themselves, such as the functional meaning of sequence elements, matches and mismatches of symbols in pairs of aligned sequences, evolutionarily conserved regions in multiple sequences, and phylogenetic relationships.

Chapter 3 of BSA introduces the fundamental algorithms of HMM theory: the Viterbi algorithm, the forward and backward algorithms, and the Baum–Welch algorithm. All of these algorithms are amenable to a variety of applications in biological sequence analysis. Some of these HMM constructions exist in parallel with non-probabilistic counterparts; for example, the Viterbi algorithm for a pair HMM parallels the classic dynamic programming algorithm for pairwise alignment. Both HMM and non-HMM approaches are also known for tasks such as finding conserved domains and building phylogenetic trees.
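As a rough illustration of two of the algorithms named above, the sketch below runs the Viterbi and forward algorithms on a toy two-state HMM over DNA. The model is purely hypothetical: the state names ("H" for GC-rich, "L" for AT-rich), all probabilities, and the function names `viterbi` and `forward` are assumptions made for this example and are not taken from the chapter. The forward pass uses plain probabilities without scaling, so it is suitable only for short sequences.

```python
import math

# Hypothetical two-state HMM over DNA (all numbers are illustrative assumptions).
STATES = ("H", "L")  # "H": GC-rich region, "L": AT-rich region
START = {"H": 0.5, "L": 0.5}
TRANS = {"H": {"H": 0.5, "L": 0.5}, "L": {"H": 0.4, "L": 0.6}}
EMIT = {
    "H": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},
    "L": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},
}


def viterbi(seq):
    """Return (most probable hidden-state path, log joint probability of that path)."""
    # Initialisation: log P(first state) + log P(first symbol | state).
    v = [{s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}]
    back = []  # back[i][s] = best predecessor of state s at position i + 1
    for x in seq[1:]:
        row, ptr = {}, {}
        for s in STATES:
            # Recursion: keep only the best-scoring predecessor for each state.
            best = max(STATES, key=lambda r: v[-1][r] + math.log(TRANS[r][s]))
            ptr[s] = best
            row[s] = v[-1][best] + math.log(TRANS[best][s]) + math.log(EMIT[s][x])
        v.append(row)
        back.append(ptr)
    # Traceback from the best final state.
    last = max(STATES, key=lambda s: v[-1][s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return "".join(path), v[-1][last]


def forward(seq):
    """Return P(seq), the probability of the sequence summed over all state paths."""
    f = {s: START[s] * EMIT[s][seq[0]] for s in STATES}
    for x in seq[1:]:
        # Same recursion as Viterbi, with the max replaced by a sum.
        f = {s: EMIT[s][x] * sum(f[r] * TRANS[r][s] for r in STATES) for s in STATES}
    return sum(f.values())
```

Calling `viterbi("GGCACTGAA")` returns a state string of the same length as the input together with its log probability, while `forward("GGCACTGAA")` returns the total probability of the sequence under the model. The two functions differ only in replacing a maximum with a sum, which is also why the Viterbi recursion resembles classic dynamic programming for alignment.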

In this chapter, the BSA problems focus on deriving the formulas that support probabilistic modeling and the HMM algorithm construction.

Type: Chapter
Information: Problems and Solutions in Biological Sequence Analysis, pp. 67-103

DOI: https://doi.org/10.1017/CBO9780511617829.004

Publisher: Cambridge University Press

Print publication year: 2006
