Probability

Tim J. Stevens; Wayne Boucher

doi:10.1017/CBO9780511843556.022

21 - Probability

Published online by Cambridge University Press: 05 February 2015

Tim J. Stevens: MRC Laboratory of Molecular Biology, Cambridge
Wayne Boucher: University of Cambridge

Summary

The basics of probability theory

The theory of probability grew out of the observation of random physical events, most notably games of chance. Naturally, calculating accurate probabilities became especially important once money was wagered on the outcome. Probability ascribes numerical values to the possible outcomes of a random process, which helps us to understand that process more fully. It lets us ask questions such as how much more often one event occurs than another, but because of the random nature of what we are studying we can never say what the outcome will definitely be. Instead we tend to think of the process in terms of the long-term proportions of the different outcomes if the random experiment were repeated a very large number of times or, where money is involved, in terms of what a wager on a particular outcome is worth.
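The idea of a long-term proportion can be illustrated with a short simulation. The following is only a minimal sketch, not taken from the chapter: it assumes a fair six-sided die, and the function names `outcome_proportion` and `wager_value`, the seed and the payout figures are all hypothetical choices made for illustration.

```python
import random


def outcome_proportion(num_trials, target=6, seed=0):
    """Fraction of simulated fair-die rolls that show `target`.

    Assumption: a fair six-sided die, modelled with Python's
    pseudo-random generator (seeded so that runs are repeatable).
    """
    rng = random.Random(seed)
    hits = sum(1 for _ in range(num_trials) if rng.randint(1, 6) == target)
    return hits / num_trials


def wager_value(prob_win, payout, stake):
    """Average gain per bet: win `payout` with probability `prob_win`,
    otherwise lose `stake`."""
    return prob_win * payout - (1.0 - prob_win) * stake


# The observed proportion should drift towards 1/6 as the trials increase
for n in (100, 10_000, 100_000):
    print(n, outcome_proportion(n))

# A bet paying 5 units on a six, for a stake of 1 unit, is worth
# (up to rounding) nothing on average, so it is a fair wager
print(wager_value(1 / 6, payout=5, stake=1))
```

No single roll can be predicted, but the proportion of sixes settles close to 1/6 as the number of rolls grows, and that same probability determines what a bet on a six is worth on average.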

Turning to biological systems, some things in living organisms occur as a result of random processes, such as the segregation of a parent’s chromosomes among their children or base-pair changes in DNA (for example, as a result of replication errors or ionising radiation). Under most circumstances, though, we don’t get to see the actual random event. For the most part we view only the outcomes, sometimes billions of years later in the case of DNA sequence changes. Of course, a DNA sequence isn’t actually random: it exists to carry biologically meaningful information, representing genes, gene control elements and so on, which have been selected for their function during evolution, even if the initial mutations were random. Nonetheless, for a sufficiently large and unbiased selection of DNA we can treat the sequence as if it were random in order to ask various questions. For example, how often do I find the sub-sequence AAGCTT in a megabase-long region of DNA?
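One way to approach the AAGCTT question is sketched below. This is an illustration under stated assumptions, not the chapter's own code: each base is taken to be independent and equally likely (probability 1/4 for each of A, C, G and T), and the helper names `expected_count` and `count_occurrences` are hypothetical. Under that model a six-base motif has probability (1/4)^6 = 1/4096 at any given position, and a sequence of one million bases offers 999,995 starting positions, so we expect roughly 244 occurrences.

```python
import random


def expected_count(motif, seq_length, base_probs=None):
    """Expected number of occurrences of `motif` in a random sequence.

    Assumption: bases are independent; unless `base_probs` is given,
    A, C, G and T are taken to be equally likely.
    """
    if base_probs is None:
        base_probs = {base: 0.25 for base in "ACGT"}

    motif_prob = 1.0
    for base in motif:
        motif_prob *= base_probs[base]

    # The motif can start at any position that leaves room for its full length
    num_positions = max(seq_length - len(motif) + 1, 0)
    return num_positions * motif_prob


def count_occurrences(seq, motif):
    """Count occurrences of `motif` in `seq`, including overlapping ones."""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


# Compare the model's expectation with one simulated megabase of sequence
rng = random.Random(1)  # seeded so that the run is repeatable
seq = "".join(rng.choices("ACGT", k=1_000_000))

print("Expected:", expected_count("AAGCTT", len(seq)))
print("Observed:", count_occurrences(seq, "AAGCTT"))
```

The observed count in a simulated sequence will scatter around the expected value. For real DNA, a count that departs strongly from the expectation would suggest that the uniform, independent-base model does not describe that region well.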

Type: Chapter
Information: Python Programming for Biology: Bioinformatics and Beyond, pp. 421-453

DOI: https://doi.org/10.1017/CBO9780511843556.022

Publisher: Cambridge University Press

Print publication year: 2015
