Book contents
- Frontmatter
- Contents
- Preface
- Acknowledgements
- 1 Prologue
- 2 A beginners’ guide
- 3 Python basics
- 4 Program control and logic
- 5 Functions
- 6 Files
- 7 Object orientation
- 8 Object data modelling
- 9 Mathematics
- 10 Coding tips
- 11 Biological sequences
- 12 Pairwise sequence alignments
- 13 Multiple-sequence alignments
- 14 Sequence variation and evolution
- 15 Macromolecular structures
- 16 Array data
- 17 High-throughput sequence analyses
- 18 Images
- 19 Signal processing
- 20 Databases
- 21 Probability
- 22 Statistics
- 23 Clustering and discrimination
- 24 Machine learning
- 25 Hard problems
- 26 Graphical interfaces
- 27 Improving speed
- Appendices
- Glossary
- Index
- Plate section
- References
21 - Probability
Published online by Cambridge University Press: 05 February 2015
Summary
The basics of probability theory
The theory of probability grew out of the observation of random physical events, most notably in games of chance. Naturally, calculating accurate probabilities became especially important when money was wagered on the outcome. Probability is a way of ascribing numerical values to the possible outcomes of a random process to help us understand it more fully. This enables us to ask questions such as how much more often one event occurs than another, but because of the random nature of what we are studying we can never say for certain what the outcome will be. Rather, we tend to think of the process in terms of the long-term proportions of the different outcomes if the random experiment were repeated a very large number of times or, if money is involved, in terms of what a wager on a particular outcome is worth.
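The idea of long-term proportions can be illustrated with a small simulation. The following is a minimal sketch, assuming Python 3 and a fair six-sided die; the function names `proportion_of_sixes` and `wager_value` are hypothetical and chosen only for this illustration. It shows the observed proportion of sixes settling towards 1/6 as the number of rolls grows, and values a simple wager as the payout multiplied by the probability of winning.

```python
import random

random.seed(0)  # Fixed seed so that repeated runs give the same rolls


def proportion_of_sixes(num_rolls):
    """Roll a simulated fair die num_rolls times; return the fraction of sixes."""
    rolls = [random.randint(1, 6) for _ in range(num_rolls)]
    return rolls.count(6) / num_rolls


def wager_value(payout, prob_win):
    """Long-run average return of a bet paying `payout` with probability prob_win."""
    return payout * prob_win


# The observed proportion approaches 1/6 as the experiment is repeated more often
for num_rolls in (10, 1000, 100000):
    print(num_rolls, proportion_of_sixes(num_rolls))

# A bet paying 6 units on a six is worth 1 unit on average (illustrative numbers)
print(wager_value(6.0, 1.0 / 6.0))
```

Any single roll remains unpredictable; only the proportion over many rolls becomes stable.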
Turning to biological systems, some things in living organisms occur as a result of random processes, such as the segregation of a parent’s chromosomes among their children or base-pair changes in DNA (for example as a result of replication errors or ionising radiation), though under most circumstances we don’t get to see the actual random event. For the most part we just view the outcomes, sometimes billions of years later in the case of DNA sequence changes. Of course a DNA sequence isn’t actually random, given that it exists to contain biologically meaningful information, representing genes, gene control elements etc., which have been selected for their function during evolution, even if the initial mutations were random. Nonetheless, for a sufficiently large and unbiased selection of DNA we can treat the sequence as if it were random in order to ask various questions. For example, how often do I find the sub-sequence AAGCTT in a megabase-long region of DNA?
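The closing question can be explored with a short sketch. It assumes the simplest possible random model, which is not stated in the text above: each base is drawn independently with probability 1/4 for A, C, G and T. Under that assumption a six-base motif matches at any given position with probability (1/4)^6 = 1/4096, so a megabase holds roughly 244 occurrences on average. The function names `expected_count` and `count_occurrences` are hypothetical, and real genomic DNA with biased base composition would give different numbers.

```python
import random


def expected_count(motif, seq_length, base_probs=None):
    """Expected number of motif occurrences in a random sequence.

    Assumes independent bases; defaults to equal probability for A, C, G, T.
    """
    if base_probs is None:
        base_probs = {base: 0.25 for base in 'ACGT'}

    prob_match = 1.0
    for base in motif:
        prob_match *= base_probs[base]

    # Number of start positions at which the whole motif fits
    num_positions = seq_length - len(motif) + 1
    return num_positions * prob_match


def count_occurrences(seq, motif):
    """Count occurrences of motif in seq, including overlapping ones."""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


random.seed(1)  # Fixed seed so the simulated sequence is repeatable
seq_length = 1000000
random_seq = ''.join(random.choices('ACGT', k=seq_length))

print(expected_count('AAGCTT', seq_length))     # About 244 under the uniform model
print(count_occurrences(random_seq, 'AAGCTT'))  # Varies around the expected value
```

The observed count in any one simulated sequence will differ from the expectation by chance, which is exactly the kind of variation probability theory lets us quantify.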
- Type: Chapter
- Information: Python Programming for Biology: Bioinformatics and Beyond, pp. 421-453. Publisher: Cambridge University Press. Print publication year: 2015