Book contents
- Frontmatter
- Contents
- Preface
- SECTION I INTRODUCTION AND BIOLOGICAL DATABASES
- SECTION II SEQUENCE ALIGNMENT
- 3 Pairwise Sequence Alignment
- 4 Database Similarity Searching
- 5 Multiple Sequence Alignment
- 6 Profiles and Hidden Markov Models
- 7 Protein Motifs and Domain Prediction
- SECTION III GENE AND PROMOTER PREDICTION
- SECTION IV MOLECULAR PHYLOGENETICS
- SECTION V STRUCTURAL BIOINFORMATICS
- SECTION VI GENOMICS AND PROTEOMICS
- APPENDIX
- Index
- Plate section
- References
5 - Multiple Sequence Alignment
Published online by Cambridge University Press: 05 June 2012
Summary
A natural extension of pairwise alignment is multiple sequence alignment, which aligns multiple related sequences to achieve optimal matching across all of them. Related sequences are identified through the database similarity searching described in Chapter 4. Because that process generates many matching sequence pairs, it is often necessary to convert the numerous pairwise alignments into a single alignment that arranges the sequences so that evolutionarily equivalent positions are matched across all of them.
Multiple sequence alignment has a unique advantage: it reveals more biological information than many pairwise alignments can. For example, it allows the identification of conserved sequence patterns and motifs across the whole sequence family, which are difficult to detect by comparing only two sequences. Many conserved and functionally critical amino acid residues can be identified in a protein multiple alignment. Multiple sequence alignment is also an essential prerequisite for phylogenetic analysis of sequence families and for prediction of protein secondary and tertiary structures. It also has applications in designing degenerate polymerase chain reaction (PCR) primers based on multiple related sequences.
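As a minimal sketch of how conserved positions become visible only once a whole family is aligned, the following Python example scans the columns of a small alignment and reports those in which every sequence carries the same residue. The function names, the `threshold` parameter, and the toy alignment are assumptions made for illustration; they are not taken from the chapter, and real analyses normally use substitution-aware conservation scores rather than simple identity.

```python
from collections import Counter


def column_conservation(alignment):
    """For each column, return (most common symbol, its fraction of the column).

    `alignment` is a list of equal-length strings; '-' marks a gap and is
    counted like any other symbol here.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for column in zip(*alignment):  # iterate column by column
        symbol, count = Counter(column).most_common(1)[0]
        result.append((symbol, count / len(column)))
    return result


def conserved_positions(alignment, threshold=1.0):
    """Return 0-based columns whose dominant symbol is a residue (not a gap)
    and occurs in at least `threshold` of the sequences."""
    return [
        i
        for i, (symbol, fraction) in enumerate(column_conservation(alignment))
        if symbol != "-" and fraction >= threshold
    ]


# Hypothetical toy alignment of four short protein fragments.
toy_alignment = [
    "MKV-LHG",
    "MKVALHG",
    "MRV-LHG",
    "MKI-LHG",
]

print(conserved_positions(toy_alignment))        # fully conserved columns
print(conserved_positions(toy_alignment, 0.75))  # columns at least 75% identical
```

Comparing only the first two sequences would suggest that almost every position is conserved; the variation at columns 1 and 2 appears only when the third and fourth sequences are included.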
In theory, dynamic programming can align any number of sequences, just as it does for pairwise alignment. However, the computing time and memory it requires increase exponentially with the number of sequences. As a consequence, full dynamic programming is impractical for datasets of more than about ten sequences. In practice, heuristic approaches are most often used.
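The exponential growth can be made concrete with a small calculation. Extending the pairwise dynamic programming matrix to N sequences gives an N-dimensional lattice with one axis per sequence, so the number of cells is the product of (length + 1) over all sequences, and each cell can be reached from up to 2^N - 1 neighboring cells. The sketch below simply counts these quantities; the function names and the sequence length of 100 are illustrative assumptions, and the code does not perform any alignment.

```python
def dp_cells(lengths):
    """Number of cells in the full dynamic programming lattice.

    Each sequence of length n contributes an axis with n + 1 positions
    (the extra position accounts for the leading gap row/column).
    """
    cells = 1
    for n in lengths:
        cells *= n + 1
    return cells


def dp_predecessors(num_sequences):
    """Maximum number of predecessor cells examined per lattice cell.

    In each step every sequence either advances by one residue or takes a
    gap, and at least one sequence must advance: 2**N - 1 combinations.
    """
    return 2 ** num_sequences - 1


# Illustrative case: every sequence is 100 residues long.
for n in (2, 3, 5, 10):
    print(n, dp_cells([100] * n), dp_predecessors(n))
```

Under these assumptions, two sequences need roughly ten thousand cells, three need about a million, and ten need on the order of 10^20, which is why heuristic methods are used in practice.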
- Type: Chapter
- Information: Essential Bioinformatics, pp. 63-74. Publisher: Cambridge University Press. Print publication year: 2006