Statistics

doi:10.1017/CBO9780511610684.004

1 - Statistics

from Part I - Introduction to the four themes

Published online by Cambridge University Press: 04 August 2010

Lior Pachter and Bernd Sturmfels

Edited by L. Pachter and B. Sturmfels

L. Pachter: University of California, Berkeley
B. Sturmfels: University of California, Berkeley

Summary

Statistics is the science of data analysis. The data to be encountered in this book are derived from genomes. Genomes consist of long chains of DNA which are represented by sequences in the letters A, C, G and T. These abbreviate the four nucleobases Adenine, Cytosine, Guanine and Thymine, which serve as fundamental building blocks in molecular biology.

What do statisticians do with their data? They build models of the process that generated the data and, in what is known as statistical inference, draw conclusions about this process. Genome sequences are particularly interesting data to draw conclusions from: they are the blueprint for life, and yet their function, structure, and evolution are poorly understood. Statistical models are fundamental for genomics, a point of view that was emphasized in [Durbin et al., 1998].
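As a minimal illustration of building a model and drawing an inference from sequence data, the sketch below assumes the simplest possible model: each letter of a DNA sequence is drawn independently from one fixed distribution over A, C, G and T. Under that assumption the maximum likelihood estimate of the distribution is the vector of observed letter frequencies. The sequence and the function names are made up for illustration and are not taken from the chapter.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def estimate_letter_probabilities(sequence):
    """Maximum likelihood estimate under an assumed i.i.d. model on A, C, G, T.

    For independent draws from a fixed distribution, the estimate is
    simply the relative frequency of each letter.
    """
    if not sequence:
        raise ValueError("sequence must be non-empty")
    counts = Counter(sequence)
    unexpected = set(counts) - set(ALPHABET)
    if unexpected:
        raise ValueError(f"unexpected letters: {sorted(unexpected)}")
    n = len(sequence)
    return {letter: counts[letter] / n for letter in ALPHABET}


def log_likelihood(sequence, probabilities):
    """Log-probability of the sequence under the assumed i.i.d. model."""
    counts = Counter(sequence)
    return sum(
        counts[letter] * math.log(probabilities[letter])
        for letter in ALPHABET
        if counts[letter] > 0
    )


# Hypothetical toy data, not a real genome fragment.
toy_sequence = "ACGTACGGTA"
theta_hat = estimate_letter_probabilities(toy_sequence)
uniform = {letter: 0.25 for letter in ALPHABET}

print(theta_hat)
print(log_likelihood(toy_sequence, theta_hat))
print(log_likelihood(toy_sequence, uniform))
```

The estimated distribution never has a lower log-likelihood on the observed sequence than any other distribution, such as the uniform one used here for comparison. Realistic models of genome sequences are far richer than this independence assumption.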

The inference tools we present in this chapter look different from those found in [Durbin et al., 1998], or most other texts on computational biology or mathematical statistics: ours are written in the language of abstract algebra. The algebraic language for statistics clarifies many of the ideas central to the analysis of discrete data, and, within the context of biological sequence analysis, unifies the main ingredients of many widely used algorithms.

Algebraic Statistics is a new field, less than a decade old, whose precise scope is still emerging. The term itself was coined by Giovanni Pistone, Eva Riccomagno and Henry Wynn as the title of their book [Pistone et al., 2000].

Type: Chapter
Information: Algebraic Statistics for Computational Biology, pp. 3-42

DOI: https://doi.org/10.1017/CBO9780511610684.004

Publisher: Cambridge University Press

Print publication year: 2005
