Pairwise sequence alignments

Tim J. Stevens; Wayne Boucher

doi:10.1017/CBO9780511843556.013

12 - Pairwise sequence alignments

Published online by Cambridge University Press: 05 February 2015

Tim J. Stevens: MRC Laboratory of Molecular Biology, Cambridge
Wayne Boucher: University of Cambridge

Summary

Sequence alignment

The alignment of biological sequences is probably the most widely used operation in bioinformatics. In essence, sequences are aligned so that we can determine how similar they are, and from this all sorts of useful information can follow, such as whether two sequences are related by evolution (they have a common ancestor) or whether they have a similar biological function. The process of comparison is called alignment because the trickiest part is deciding which parts of two sequences are equivalent to one another, i.e. how the residues of the different sequences can be paired up. Usually when we align sequences we seek the best alignment out of the vast number of possible comparisons by finding the combination of residue pairs, one from each sequence, which gives the highest overall similarity score.
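As a rough illustration of what "highest overall score" means, the following minimal sketch scores two candidate ways of pairing up the same two DNA sequences and keeps the better one. The function name alignment_score, the example sequences and the match, mismatch and gap values are all assumptions made for this illustration; they are not taken from the chapter, and real scoring schemes are usually more elaborate.

```python
def alignment_score(aligned_a, aligned_b, match=1, mismatch=-1, gap=-2):
    """Sum a per-column score over two equal-length, gapped sequences.

    The scoring values are illustrative assumptions: +1 for identical
    residues, -1 for different residues, -2 where either side is a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    score = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            score += gap       # residue paired with a gap
        elif res_a == res_b:
            score += match     # identical residue pair
        else:
            score += mismatch  # differing residue pair
    return score


# Two candidate pairings of the same sequences (hypothetical data).
candidates = [
    ("AGCTTGA", "AGC-TGA"),
    ("AGCTTGA", "-AGCTGA"),
]

# The "best" alignment is the candidate with the highest total score.
best = max(candidates, key=lambda pair: alignment_score(*pair))

for seq_a, seq_b in candidates:
    print(seq_a, seq_b, alignment_score(seq_a, seq_b))
```

Only two candidates are compared here. An actual alignment method has to search the full space of possible pairings, which is why dynamic programming approaches are normally used rather than enumeration.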

Once a sequence alignment has been achieved, and assuming you trust the results, you can treat the aligned regions as having a degree of equivalency. If the alignment is good enough you might be able to say, for example, that two DNA sequences relate to the same kind of gene, despite the nucleotides not being exactly the same. It should always be remembered that a sequence alignment can give only a limited amount of information about the underlying biology, although it is often an excellent starting point. Even where the knowledge gained is distinctly incomplete, a sequence alignment is quick to perform and often helps to guide experiments. With one simple database search, i.e. by aligning a query against a database of well-studied sequences, you might significantly narrow down the possibilities for what a section of DNA or protein could be, or say what it definitely is not. Sequence alignments are also done in a laboratory setting to guide procedures, for example to determine which part of a protein to investigate.
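The idea of a database search can be sketched as aligning one query against every stored sequence and ranking the results by score. The example below is a toy version under stated assumptions: the names global_alignment_score, search_database and toy_database, the sequences and the scoring values are all hypothetical. The score comes from a simple global dynamic-programming recurrence with a linear gap penalty, whereas real database searches generally rely on much faster heuristic tools.

```python
def global_alignment_score(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score for two sequences (score only).

    Uses a row-by-row dynamic programming table with a linear gap
    penalty; the scoring values are illustrative assumptions.
    """
    # Row 0: aligning an empty prefix of seq_a against prefixes of seq_b.
    previous = [j * gap for j in range(len(seq_b) + 1)]

    for i, res_a in enumerate(seq_a, start=1):
        current = [i * gap]  # column 0: prefix of seq_a against nothing
        for j, res_b in enumerate(seq_b, start=1):
            pair = match if res_a == res_b else mismatch
            current.append(max(
                previous[j - 1] + pair,  # pair res_a with res_b
                previous[j] + gap,       # res_a opposite a gap
                current[j - 1] + gap,    # res_b opposite a gap
            ))
        previous = current

    return previous[-1]


def search_database(query, database):
    """Return (score, name) tuples, highest-scoring entry first."""
    hits = [(global_alignment_score(query, seq), name)
            for name, seq in database.items()]
    return sorted(hits, reverse=True)


# A hypothetical "database" of named sequences.
toy_database = {
    "seq_one": "AGCTGA",
    "seq_two": "TTTTTT",
    "seq_three": "AGCTTGA",
}

ranked = search_database("AGCTTGA", toy_database)
for score, name in ranked:
    print(name, score)
```

A low score for an entry is as informative as a high one in this setting: it is the basis for saying what a sequence is probably not.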

Type: Chapter
Information: Python Programming for Biology: Bioinformatics and Beyond, pp. 208-231

DOI: https://doi.org/10.1017/CBO9780511843556.013

Publisher: Cambridge University Press

Print publication year: 2015
