Biology

doi:10.1017/CBO9780511610684.007

4 - Biology

from Part I - Introduction to the four themes

Published online by Cambridge University Press: 04 August 2010

Lior Pachter and Bernd Sturmfels

Edited by L. Pachter and B. Sturmfels

L. Pachter: University of California, Berkeley
B. Sturmfels: University of California, Berkeley

Summary

This chapter describes genome sequence data and explains the relevance of the statistics, computation and algebra that we have discussed in Chapters 1–3 to understanding the function of genomes and their evolution. It sets the stage for the studies in biological sequence analysis in some of the later chapters.

Given that quantitative methods play an increasingly important role in many different aspects of biology, the question arises: why the emphasis on genome sequences? The most significant answer is that genomes are fundamental objects that carry instructions for the self-assembly of living organisms. Ultimately, our understanding of human biology will be based on an understanding of the organization and function of our genome. Another reason to focus on genomes is the abundance of high-fidelity data. Current finished genome sequences have fewer than one error in 10,000 bases. Statistical methods can therefore be directly applied to modeling the random evolution of genomes and to making inferences about the structure and organization of functional elements; there is no need to worry about extracting signal from noisy data. Furthermore, it is possible to validate findings with laboratory experiments.

The rate of accumulation of genome sequence data has been extraordinary, far outpacing Moore's law for the increasing density of transistors on circuit chips. This is due to breakthroughs in sequencing technologies and radical advances in automation. Since the first complete genome of a free-living organism was sequenced in 1995 (Haemophilus influenzae [Fleischmann et al., 1995]), biologists have completely sequenced over 200 microbial genomes, and dozens of complete invertebrate and vertebrate genomes.

Type: Chapter
Information: Algebraic Statistics for Computational Biology, pp. 125–160

DOI: https://doi.org/10.1017/CBO9780511610684.007

Publisher: Cambridge University Press

Print publication year: 2005
