Book contents
- Frontmatter
- Contents
- Preface
- Acknowledgements
- 1 Prologue
- 2 A beginners’ guide
- 3 Python basics
- 4 Program control and logic
- 5 Functions
- 6 Files
- 7 Object orientation
- 8 Object data modelling
- 9 Mathematics
- 10 Coding tips
- 11 Biological sequences
- 12 Pairwise sequence alignments
- 13 Multiple-sequence alignments
- 14 Sequence variation and evolution
- 15 Macromolecular structures
- 16 Array data
- 17 High-throughput sequence analyses
- 18 Images
- 19 Signal processing
- 20 Databases
- 21 Probability
- 22 Statistics
- 23 Clustering and discrimination
- 24 Machine learning
- 25 Hard problems
- 26 Graphical interfaces
- 27 Improving speed
- Appendices
- Glossary
- Index
- Plate section
- References
25 - Hard problems
Published online by Cambridge University Press: 05 February 2015
Summary
Solving hard problems
This chapter deals with problems that cannot be readily solved with a straightforward, deterministic algorithm. This includes problems that computer scientists would describe as being in NP but, as far as is known, not in P: a proposed solution can be checked in polynomial time (NP stands for non-deterministic polynomial time), but no polynomial-time method for finding a solution is known. This is a way of saying that a problem is not efficiently solvable. Whether a problem is straightforward to solve will depend on the complexity of the system. To take a classic example, solving the gravitational equations for two orbiting masses, like the Sun and Earth, is fairly easy, but adding more masses, e.g. the Moon or Mars, makes the problem much harder. The basic equations of the system do not have to be complicated, though. Another famous (NP-hard) problem is the travelling salesman problem. Here the objective is to find the shortest tour that goes through all the places on the salesman’s list. The problem is easy to describe, and it is easy to calculate the length of a candidate solution (a route), but the number of possible routes grows factorially with the number of places to visit, so finding the best solution can be difficult. This is somewhat different from a classic optimisation problem, e.g. finding the minimum of a function, where you can typically follow gradients to home in on the answer.
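The contrast between measuring a route and finding the best one can be illustrated with a minimal Python sketch. It is not taken from the chapter itself: the function names `route_length` and `brute_force_tour`, the four example places and the assumption that the tour returns to its starting point are all illustrative choices.

```python
from itertools import permutations
from math import dist, factorial

def route_length(route, coords):
    """Length of a closed tour visiting the places in the given order.

    Assumes the salesman returns to the starting place (an assumption
    made for this sketch).
    """
    total = 0.0
    for i, place in enumerate(route):
        next_place = route[(i + 1) % len(route)]
        total += dist(coords[place], coords[next_place])
    return total

def brute_force_tour(coords):
    """Try every ordering of the places; only feasible for a handful."""
    places = sorted(coords)
    start, rest = places[0], places[1:]  # Fix the start to skip rotations
    best_route, best_length = None, float('inf')
    for perm in permutations(rest):
        route = (start,) + perm
        length = route_length(route, coords)  # Cheap: one pass over the route
        if length < best_length:
            best_route, best_length = route, length
    return best_route, best_length

# Hypothetical places at the corners of a 4 x 3 rectangle
coords = {'A': (0, 0), 'B': (0, 3), 'C': (4, 3), 'D': (4, 0)}
best_route, best_length = brute_force_tour(coords)
print(best_route, best_length)

# Number of orderings to test with a fixed start: (n-1)!
for n in (5, 10, 20):
    print(n, factorial(n - 1))
```

Scoring a single route takes one pass over its places, whereas the exhaustive search has to score (n-1)! orderings, which is already about 1.2 x 10^17 for 20 places.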
When it comes to biological information there are many situations of this kind, because biology frequently deals with large and interacting systems. For example, determining the structure of a protein generally involves several thousand atoms, and in general we can only ‘solve’ the structure with good experimental data (e.g. from high-resolution X-ray crystallography); it is not sufficient to start with unstructured atoms and a physical model. However, for a complex problem like this, much as with measuring the length of a travelling salesman’s route, testing a given solution to see if it is better or worse can be comparatively straightforward. Referring again to protein structures, there are many methods that can quickly calculate the likelihood (or energy) of a structural model.
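As a rough illustration of how cheaply a candidate solution can be scored, the following Python sketch compares two toy structural models. The `clash_score` function is a hypothetical stand-in for an energy calculation, not a real force field or a method from the chapter: it simply penalises pairs of atoms that lie closer than an assumed minimum separation `min_dist`, and the coordinates are invented for the example.

```python
from itertools import combinations
from math import dist

def clash_score(coords, min_dist=3.0):
    """Toy 'energy': sum of squared overlaps for atom pairs closer than
    min_dist. Both the functional form and the 3.0 threshold are
    assumptions made for illustration only.
    """
    score = 0.0
    for atom_a, atom_b in combinations(coords, 2):
        separation = dist(atom_a, atom_b)
        if separation < min_dist:
            score += (min_dist - separation) ** 2
    return score

# Two hypothetical three-atom models (x, y, z coordinates)
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
model_b = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0)]

# The lower score is taken to be the better model
better = min((model_a, model_b), key=clash_score)
print(clash_score(model_a), clash_score(model_b))
```

Scoring examines each pair of atoms once, so comparing two models is quick even though the number of possible arrangements of the atoms is far too large to enumerate.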
- Type: Chapter
- Information: Python Programming for Biology: Bioinformatics and Beyond, pp. 545-565. Publisher: Cambridge University Press. Print publication year: 2015