2 - Computation
from Part I - Introduction to the four themes
Published online by Cambridge University Press: 04 August 2010
Summary
Many of the algorithms used for biological sequence analysis are discrete algorithms, i.e., the key feature of the problems being solved is that some optimization needs to be performed on a finite set. Discrete algorithms are complementary to numerical algorithms, such as Expectation Maximization, Singular Value Decomposition and Interval Arithmetic, which make their appearance in later chapters. They are also distinct from algebraic algorithms, such as the Buchberger Algorithm, which is discussed in Section 3.1. In what follows we introduce discrete algorithms and mathematical concepts which are relevant for biological sequence analysis. The final section of this chapter offers an annotated list of the computer programs which are used throughout the book. The list ranges over all three themes (discrete, algebraic, numerical) and includes software tools which are useful for research in computational biology.
Some discrete algorithms arise naturally from algebraic statistical models, which are characterized by finitely many polynomials, each with finitely many terms. Inference methods for drawing conclusions about missing or hidden data depend on the combinatorial structure of the polynomials in the algebraic representation of the models. In fact, many widely used dynamic programming methods, such as the Needleman–Wunsch algorithm for sequence alignment, can be interpreted as evaluating polynomials, albeit with tropical arithmetic, in which addition is replaced by taking a minimum and multiplication is replaced by ordinary addition.
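As a rough illustration of this interpretation, the sketch below runs one alignment recursion over two different arithmetics. With the tropical (min, +) arithmetic it returns the cost of a cheapest global alignment, as in Needleman–Wunsch; with ordinary (+, ×) arithmetic and all weights set to 1 it counts the alignments, that is, it evaluates the same polynomial at the all-ones point. The names `Semiring`, `TROPICAL`, `ORDINARY` and `align_value`, and the unit-cost scoring scheme, are assumptions made for this example and are not taken from the chapter.

```python
from typing import Callable, NamedTuple


class Semiring(NamedTuple):
    """Hypothetical container for the two operations and their identities."""
    add: Callable[[float, float], float]
    mul: Callable[[float, float], float]
    zero: float  # identity element for add
    one: float   # identity element for mul


# Tropical arithmetic: "sum" is min, "product" is ordinary addition.
TROPICAL = Semiring(add=min, mul=lambda a, b: a + b, zero=float("inf"), one=0.0)
# Ordinary arithmetic on real numbers.
ORDINARY = Semiring(add=lambda a, b: a + b, mul=lambda a, b: a * b, zero=0.0, one=1.0)


def align_value(s: str, t: str, sr: Semiring,
                match: float, mismatch: float, gap: float) -> float:
    """Evaluate the alignment recursion for s and t in the semiring sr.

    Each alignment contributes the sr-product of its step weights, and the
    result is the sr-sum over all alignments.
    """
    n, m = len(s), len(t)
    table = [[sr.zero] * (m + 1) for _ in range(n + 1)]
    table[0][0] = sr.one
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            total = sr.zero
            if i > 0:  # character of s aligned to a gap
                total = sr.add(total, sr.mul(table[i - 1][j], gap))
            if j > 0:  # character of t aligned to a gap
                total = sr.add(total, sr.mul(table[i][j - 1], gap))
            if i > 0 and j > 0:  # match or mismatch
                w = match if s[i - 1] == t[j - 1] else mismatch
                total = sr.add(total, sr.mul(table[i - 1][j - 1], w))
            table[i][j] = total
    return table[n][m]


# Tropical evaluation: minimum alignment cost (here, the edit distance).
best_cost = align_value("ACGT", "AGT", TROPICAL, match=0, mismatch=1, gap=1)
# Ordinary evaluation at all-ones weights: the number of alignments.
num_alignments = align_value("AC", "AG", ORDINARY, match=1, mismatch=1, gap=1)
```

Only the arithmetic changes between the two calls; the recursion itself is identical. That shared recursion is the sense in which the dynamic program "evaluates a polynomial".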
The combinatorial structure of a polynomial, or polynomial map, is encoded in its Newton polytope. Thus every algebraic statistical model has a Newton polytope, and it is the structure of this polytope which governs dynamic programming related to that model.
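To make the definition concrete: the Newton polytope of a polynomial is the convex hull of the exponent vectors of its terms with nonzero coefficient. The sketch below computes it for a polynomial in two variables, where the polytope is a polygon, using a standard monotone-chain convex hull. The dictionary representation of the polynomial and the names `newton_polygon` and `tropical_value` are assumptions for this example; the two-variable restriction is only for brevity, since the polytopes that arise from statistical models are generally higher-dimensional.

```python
from typing import Dict, List, Tuple

Exponent = Tuple[int, int]


def cross(o: Exponent, a: Exponent, b: Exponent) -> int:
    """Signed area test: positive if o -> a -> b turns counterclockwise."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def newton_polygon(poly: Dict[Exponent, float]) -> List[Exponent]:
    """Vertices of the Newton polygon, in counterclockwise order.

    poly maps exponent pairs (i, j) to the coefficient of x^i * y^j.
    Terms with zero coefficient do not contribute.
    """
    pts = sorted({e for e, c in poly.items() if c != 0})
    if len(pts) <= 2:
        return pts
    lower: List[Exponent] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Exponent] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def tropical_value(exponents, w: Tuple[float, float]) -> float:
    """Tropical (min-plus) evaluation at w, ignoring coefficients."""
    return min(w[0] * i + w[1] * j for i, j in exponents)


# f(x, y) = 1 + 3x^2 + xy + 5y^2 + x^2 y
f = {(0, 0): 1, (2, 0): 3, (1, 1): 1, (0, 2): 5, (2, 1): 1}
vertices = newton_polygon(f)
```

For this `f`, the exponent (1, 1) lies in the interior of the polygon and is not a vertex. A linear function attains its minimum over a polytope at a vertex, so the tropical value of `f` at any weight vector can be computed from the vertices alone. This is one way to see why the polytope governs the associated optimization.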
Type: Chapter
In: Algebraic Statistics for Computational Biology, pp. 43-84
Publisher: Cambridge University Press
Print publication year: 2005