  • Print publication year: 2010
  • Online publication date: June 2012

9 - Basic algorithms of bioinformatics



The primary goal of the field of bioinformatics is to examine the biologically important information that is stored, used, and transferred by living things, and how this information acts to control the chemical environment within living organisms. This work has led to technological successes such as the rapid development of potent HIV-1 proteinase inhibitors, the development of new hybrid seeds or genetic variations for improved agriculture, and even a new understanding of prehistoric patterns of human migration. Before discussing useful algorithms for bioinformatic analysis, we must first cover some preliminary concepts that will allow us to speak a common language.

Deoxyribonucleic acid (DNA) is the genetic material that is passed down from parent to offspring. In eukaryotic organisms, DNA is stored in the nuclei of cells, and it forms a complete set of instructions for the growth, development, and functioning of a living organism. As a macromolecule, DNA can encode a vast amount of information in a specific sequence of bases: guanine (G), adenine (A), thymine (T), and cytosine (C). Each base is attached to a phosphate group and a deoxyribose sugar to form a nucleotide unit. The four different nucleotides are strung into long polynucleotide chains, which are organized into genes that are thousands of bases long, and ultimately into chromosomes.
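Because a DNA strand is fully specified by the order of its four bases, it can be treated computationally as a string over the alphabet G, A, T, C. The following minimal sketch illustrates this view in Python. The function names `base_counts` and `max_information_bits`, and the short example fragment, are hypothetical and chosen only for illustration. The sketch assumes an unambiguous sequence containing only the four standard bases.

```python
from collections import Counter
import math

# The four DNA bases named in the text.
DNA_ALPHABET = "GATC"


def base_counts(sequence):
    """Return how many times each of the four bases occurs in a DNA string."""
    seq = sequence.upper()
    invalid = set(seq) - set(DNA_ALPHABET)
    if invalid:
        # Assumption: ambiguity codes such as N are not handled in this sketch.
        raise ValueError(f"unexpected symbols in sequence: {sorted(invalid)}")
    counts = Counter(seq)
    return {base: counts[base] for base in DNA_ALPHABET}


def max_information_bits(length):
    """Upper bound on information content: log2(4) = 2 bits per base."""
    return length * math.log2(len(DNA_ALPHABET))


# Hypothetical ten-base fragment, not taken from any real gene.
fragment = "GATTACAGGC"
print(base_counts(fragment))
print(max_information_bits(len(fragment)))
```

With four possible symbols at each position, a chain of n bases can take 4^n distinct forms, so a gene thousands of bases long has an upper bound of several thousand bits of information under this simple counting argument.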
