Book contents
- Frontmatter
- Contents
- Preface
- Acknowledgements
- 1 Prologue
- 2 A beginners’ guide
- 3 Python basics
- 4 Program control and logic
- 5 Functions
- 6 Files
- 7 Object orientation
- 8 Object data modelling
- 9 Mathematics
- 10 Coding tips
- 11 Biological sequences
- 12 Pairwise sequence alignments
- 13 Multiple-sequence alignments
- 14 Sequence variation and evolution
- 15 Macromolecular structures
- 16 Array data
- 17 High-throughput sequence analyses
- 18 Images
- 19 Signal processing
- 20 Databases
- 21 Probability
- 22 Statistics
- 23 Clustering and discrimination
- 24 Machine learning
- 25 Hard problems
- 26 Graphical interfaces
- 27 Improving speed
- Appendices
- Glossary
- Index
- Plate section
- References
27 - Improving speed
Published online by Cambridge University Press: 05 February 2015
Summary
Running things faster
This chapter is about making Python programs run faster. We discuss how to optimise existing routines so that they take less time to run, going beyond the simple Python tips and tricks discussed earlier. We begin with a basic introduction to parallel computing, where a job is split into parts that run concurrently on separate processors (or processing cores). For this we use modules that are available from Python 2.6 onwards, which allow programs to take advantage of the multiple processing cores present in a single computer. The remainder of the chapter deals with improving the performance of a single processing job.
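As a minimal sketch of the idea of splitting a job across processing cores, the following uses the standard `multiprocessing` module (part of Python since version 2.6). The function names `count_gc` and `parallel_gc_counts` and the example sequences are hypothetical, chosen only for illustration, and the `with Pool(...)` form assumes Python 3.3 or later. It is not necessarily the approach taken in the chapter itself.

```python
from multiprocessing import Pool


def count_gc(sequence):
    """Count the G and C letters in one DNA sequence (the per-item work)."""
    return sum(1 for base in sequence.upper() if base in "GC")


def parallel_gc_counts(sequences, num_processes=2):
    """Share the sequences between worker processes.

    Pool.map() hands items to the workers and returns the results
    in the same order as the input list.
    """
    with Pool(processes=num_processes) as pool:
        return pool.map(count_gc, sequences)


if __name__ == "__main__":
    # Hypothetical input data, for illustration only
    seqs = ["GATTACA", "GGCCGGCC", "ATATAT", "CGCGTTAA"]
    print(parallel_gc_counts(seqs))  # [2, 8, 0, 4]
```

The `if __name__ == "__main__":` guard matters here: on platforms that start worker processes by re-importing the main module, it stops each worker from trying to launch a pool of its own. For a task this small the cost of starting the processes will likely outweigh any gain; parallelism tends to pay off only when each item involves substantial computation.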
At the end of the chapter some timing results are given so that the reader can see how much was gained for the effort. For mathematical routines involving many loops, speed improvements of tenfold or more are not uncommon. The fine details of the logic and underlying algorithms of the examples used here will not be described; instead, an example is taken from earlier in the book, where such things are described fully. Which particular example we have chosen is not especially important, other than that it is computationally intensive and takes a noticeable time to run.
This chapter comes with a ‘health warning’ for novice programmers, because the mainstay of the optimisation is to move away from Python. Some of the focus is on the low-level compiled language C, although it is used to provide a module that can still be used directly inside Python programs. The details of the C language, and how to compile it, are not discussed; to actually learn to program in C we recommend further reading. Nonetheless, if you have no experience with C, we hope to provide a basic appreciation of how it can help. We also consider Cython, a C-like extension to Python, which makes it possible to benefit from the speed of C without necessarily having to deal with all of its complexities. This is particularly powerful in combination with NumPy arrays.
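To illustrate how timing comparisons of this kind might be gathered, here is a minimal sketch using the standard `timeit` module. The two functions and the helper `best_time` are hypothetical stand-ins, not the example used in the chapter; they simply compute the same result in two ways so that their run times can be compared.

```python
import timeit


def sum_of_squares_loop(n):
    """Explicit Python loop: the kind of code that is often slow."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_of_squares_builtin(n):
    """Same calculation, with the summation done by the built-in sum()."""
    return sum(i * i for i in range(n))


def best_time(func, arg, repeats=3, number=10):
    """Return the best average time per call, in seconds.

    Taking the minimum over several repeats reduces the influence
    of other activity on the machine.
    """
    timer = timeit.Timer(lambda: func(arg))
    return min(timer.repeat(repeat=repeats, number=number)) / number


if __name__ == "__main__":
    n = 100000
    for func in (sum_of_squares_loop, sum_of_squares_builtin):
        print(func.__name__, best_time(func, n))
```

The actual numbers depend on the machine and the Python version, so any comparison should be made by running both versions on the same system with the same input.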
- Type: Chapter
- Information: Python Programming for Biology: Bioinformatics and Beyond, pp. 582-605. Publisher: Cambridge University Press. Print publication year: 2015