Book contents
- Frontmatter
- Contents
- Preface
- Acknowledgements
- 1 Prologue
- 2 A beginners’ guide
- 3 Python basics
- 4 Program control and logic
- 5 Functions
- 6 Files
- 7 Object orientation
- 8 Object data modelling
- 9 Mathematics
- 10 Coding tips
- 11 Biological sequences
- 12 Pairwise sequence alignments
- 13 Multiple-sequence alignments
- 14 Sequence variation and evolution
- 15 Macromolecular structures
- 16 Array data
- 17 High-throughput sequence analyses
- 18 Images
- 19 Signal processing
- 20 Databases
- 21 Probability
- 22 Statistics
- 23 Clustering and discrimination
- 24 Machine learning
- 25 Hard problems
- 26 Graphical interfaces
- 27 Improving speed
- Appendices
- Glossary
- Index
- Plate section
- References
24 - Machine learning
Published online by Cambridge University Press: 05 February 2015
Summary
A guide to machine learning
When using computers to solve scientific problems there can be situations where you have some measured data and a related property of the data, but there is no known or fixed formula to link the two. Sometimes the link between the two sets of data may be easy for a human to see, but difficult to encode in a computer algorithm. A simple example is reading handwriting: humans do not write in a fixed typeface and every letter of a given kind will be written slightly differently, yet we can read most other people's handwriting without much effort. When we look at writing we attempt to recognise the letters and words, and where there is ambiguity we can use our intelligence to infer what was intended, using the context of what the writing means or any other clues that we can glean. Writing a computer program to read handwriting is difficult, and the result is not nearly as reliable as a person would be. Nevertheless it can be done, and is put to good use in the mechanised sorting of mail by postal (zip) code. The common trick to getting a computer to perform tasks like this is not to program it with an elaborate, hand-designed rule, but rather to give the program a degree of artificial intelligence so that it can come up with its own rules and learn. The exercise whereby a program comes up with its own rules to solve a problem is often referred to as machine learning. It should be noted, however, that we usually don't expect a computer to learn a task perfectly; if perfection were possible we generally wouldn't have to resort to such means. Instead it is best to think of machine learning algorithms as making predictions, and their predictive power should be tested before we rely on it. Two kinds of machine learning are commonly discussed, supervised learning and unsupervised learning, and we will give examples of both in this chapter.
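The idea of testing predictive power before relying on it can be illustrated with a minimal sketch. This is not code from the chapter: the data are made-up two-dimensional clusters, the nearest-neighbour rule is just one simple supervised method chosen for brevity, and all function names (`make_toy_data`, `nearest_neighbour_predict`, `holdout_accuracy`) are hypothetical. Part of the labelled data is held back, and predictions are compared with the known labels of that held-back part only.

```python
import random

def nearest_neighbour_predict(train_points, train_labels, query):
    """Return the label of the training point closest to the query point."""
    best_label = None
    best_dist = None
    for point, label in zip(train_points, train_labels):
        # Squared Euclidean distance is enough for ranking neighbours
        dist = sum((a - b) ** 2 for a, b in zip(point, query))
        if best_dist is None or dist < best_dist:
            best_dist = dist
            best_label = label
    return best_label

def make_toy_data(n_per_class, rng):
    """Invented example data: two well separated 2D clusters, 'A' and 'B'."""
    points, labels = [], []
    for label, centre in (('A', (0.0, 0.0)), ('B', (5.0, 5.0))):
        for _ in range(n_per_class):
            x = rng.gauss(centre[0], 1.0)
            y = rng.gauss(centre[1], 1.0)
            points.append((x, y))
            labels.append(label)
    return points, labels

def holdout_accuracy(points, labels, test_fraction, rng):
    """Fraction of held-out points whose label is predicted correctly."""
    indices = list(range(len(points)))
    rng.shuffle(indices)
    n_test = int(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train_points = [points[i] for i in train_idx]
    train_labels = [labels[i] for i in train_idx]
    correct = 0
    for i in test_idx:
        predicted = nearest_neighbour_predict(train_points, train_labels, points[i])
        if predicted == labels[i]:
            correct += 1
    return correct / n_test

rng = random.Random(1)  # Fixed seed so the run is repeatable
points, labels = make_toy_data(50, rng)
accuracy = holdout_accuracy(points, labels, 0.25, rng)
print('Held-out accuracy: %.2f' % accuracy)
```

Because the test points play no part in making the predictions, the reported accuracy is an estimate of how the method might behave on new data, rather than a measure of how well it memorised what it was given. On real data the classes would rarely be this cleanly separated.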
- Type: Chapter
- Information: Python Programming for Biology: Bioinformatics and Beyond, pp. 511-544
- Publisher: Cambridge University Press
- Print publication year: 2015