A biologist will often turn to computer programming when the amount or complexity of data is too great to be handled sensibly in a spreadsheet and no other, more specialised, software exists. Often only a relatively simple program is needed to extract something useful from biological data that would otherwise remain inaccessible. For biologists, writing a computer program can seem like a significant barrier, but once the basic programming skills have been learned many possibilities open up. This chapter offers an introduction to the Python language and gives some concrete examples of programs that may be useful in molecular biology. There is not enough space to cover every aspect of the language or its finer details, and for these we recommend further reading. Nonetheless, we hope this chapter serves to illustrate the basics and to show what is possible.
Python is one of the most popular programming languages and is becoming an increasingly attractive option for the biologist. It is a high-level, general-purpose language that is well supported and relatively easy to learn; indeed, it is now taught in mainstream UK schools. It also has a large number of external modules, including many relating to mathematics, science and biology. Python is easy to install, if it is not already installed as standard, and runs on almost all kinds of computer system. In this chapter we will show some of the features and capabilities of Python 3 and then apply them in several example programs to illustrate the sort of things that can be achieved in molecular biology. Python version 2, should you need to work with it instead, is very similar, and most programs are easily transferred (e.g. using the conversion program 2to3, which was supplied with Python up to version 3.12), although the two versions are not 100% compatible.
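As a brief illustration of why the two versions are not fully compatible, the sketch below shows two well-known differences. It is written for Python 3, and the variable names and values are our own, chosen purely for illustration.

```python
# Hypothetical values, used only to illustrate the language differences.
codon_length = 3
sequence_length = 10

# Python 3: print is a function and needs parentheses.
# In Python 2 it was a statement, e.g.: print "Codon length:", codon_length
print("Codon length:", codon_length)

# Python 3: "/" always gives a floating point result.
# In Python 2, "/" between two integers discarded the remainder.
exact_codons = sequence_length / codon_length

# "//" gives whole-number (floor) division in both versions.
whole_codons = sequence_length // codon_length

print("Exact:", exact_codons, "Whole:", whole_codons)
```

Differences of this kind are mostly mechanical, which is why much of a conversion can be automated.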
Even if you don't intend to use Python in the long run or for all your programming work, it nonetheless serves as a good starting point for learning some of the major principles of many modern computing languages. Like Perl, MATLAB and R, Python is a high-level language that is interpreted directly when a program is run; there is no distinct compilation step.
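To make this concrete, here is a minimal sketch of a complete program. We assume it is saved in a file with the hypothetical name gc_content.py, in which case it can be run directly with a command such as python3 gc_content.py (the exact command may differ between systems), without compiling it first. The function name and the short DNA sequence are invented for illustration.

```python
def gc_fraction(dna):
    """Return the fraction of bases in dna that are G or C."""
    dna = dna.upper()
    if not dna:
        # Avoid dividing by zero for an empty sequence.
        return 0.0
    gc_count = dna.count("G") + dna.count("C")
    return gc_count / len(dna)


example_seq = "ATGGCGTACGCTTAA"  # made-up sequence, not real data
fraction = gc_fraction(example_seq)
print("GC fraction:", round(fraction, 3))
```

Editing the sequence and running the file again gives the new result immediately, which is what makes an interpreted language convenient for exploratory work.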