Profiles and Hidden Markov Models

Jin Xiong

doi:10.1017/CBO9780511806087.007

6 - Profiles and Hidden Markov Models

Published online by Cambridge University Press: 05 June 2012

Affiliation: Texas A&M University

Summary

One way multiple sequence alignments are used to identify related sequences in databases is through the construction of position-specific scoring matrices (PSSMs), profiles, and hidden Markov models (HMMs). These are statistical models that reflect the frequencies of amino acid or nucleotide residues in a multiple alignment, so they can be treated as a consensus for a given sequence family. The “consensus,” however, is not a single sequence but a model that captures both the observed frequencies and the predicted frequencies of unobserved characters. Because these models allow partial matches with a query sequence, they can detect more distant members of the same sequence family, which increases the sensitivity of database searches. This chapter covers the basics of these statistical models and then discusses their applications.

POSITION-SPECIFIC SCORING MATRICES

A PSSM is a table that contains probability information for amino acids or nucleotides at each position of an ungapped multiple sequence alignment. The matrix resembles the substitution matrices discussed in Chapter 3 but is more complex because it also carries positional information from the alignment. In such a table, the rows represent residue positions of a particular multiple alignment and the columns represent the residue types, or vice versa (Fig. 6.1). The values in the table are log odds scores of the residues, calculated from the multiple alignment.
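The definition above can be illustrated with a minimal sketch that builds a PSSM from a small ungapped nucleotide alignment. Several details here are assumptions, not taken from the chapter: the function names `build_pssm` and `score_segment`, the uniform background frequencies, the base-2 logarithm, and the simple additive pseudocount used to give unobserved residues a nonzero predicted frequency.

```python
import math
from collections import Counter


def build_pssm(alignment, alphabet="ACGT", background=None, pseudocount=1.0):
    """Build a PSSM as a list of dicts (one per column): residue -> log2 odds.

    Assumptions (illustrative only): uniform background frequencies unless
    given, and an additive pseudocount for residues not seen in a column.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must have equal length (ungapped alignment)")
    if background is None:
        background = {res: 1.0 / len(alphabet) for res in alphabet}

    n_seqs = len(alignment)
    denominator = n_seqs + pseudocount * len(alphabet)
    pssm = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment)
        column_scores = {}
        for res in alphabet:
            # Smoothed frequency of this residue at this position.
            freq = (counts[res] + pseudocount) / denominator
            # Log odds: observed (smoothed) frequency versus background.
            column_scores[res] = math.log2(freq / background[res])
        pssm.append(column_scores)
    return pssm


def score_segment(pssm, segment):
    """Sum the position-specific scores of a segment as long as the PSSM."""
    if len(segment) != len(pssm):
        raise ValueError("segment length must equal the number of PSSM columns")
    return sum(column[res] for column, res in zip(pssm, segment))


alignment = ["ACGT", "ACGA", "ACGT", "TCGT"]
pssm = build_pssm(alignment)
for position, column in enumerate(pssm, start=1):
    print(position, {res: round(val, 2) for res, val in column.items()})
print("ACGT:", round(score_segment(pssm, "ACGT"), 2))
print("TTTT:", round(score_segment(pssm, "TTTT"), 2))
```

In this sketch each row corresponds to an alignment position and each key to a residue type. A segment that matches the family at most positions scores higher than one that does not, while a residue never observed at a position still receives a finite (negative) score, which is what allows partial matches.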

Type: Chapter
Information: Essential Bioinformatics, pp. 75–84

Publisher: Cambridge University Press

Print publication year: 2006
