Adaptation by natural selection is the most important process in biology: nothing else can explain the staggering complexity and diversity of organisms, cells, enzymes, and proteins. From the skeleton of a blue whale to the flagellum of a bacterium, some 100 million times smaller, all living structures result from the repeated fixation and elimination of genetic variants within populations. It should come as no surprise, then, that a key goal of modern evolutionary biology is to identify the genes or genome regions that have been targeted by natural selection, thereby pinpointing the genotypic variation that causes some individuals to live longer and reproduce faster than others. Great efforts have been made over recent decades to understand the “molecular footprint” left in nucleotide and protein sequences by the past action of natural selection. These efforts have produced a substantial framework of theory and an impressive array of statistical methods, of which we give only the barest introduction in this chapter. We begin by considering the dynamics through time of genetic variants within a population, then review the statistical approaches most commonly used to identify natural selection at the molecular level.
To illustrate the essential concepts of molecular adaptation, it is convenient to consider the fate of a new genetic variant (mutant) present in a single individual belonging to a population whose other members are genetically homogeneous and carry a “wild-type” genotype.
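The thought experiment above can be explored with a small simulation. The sketch below is only illustrative and rests on assumptions not stated in the text: a haploid population of constant size, discrete non-overlapping generations, and a mutant whose fitness relative to the wild type is 1 + s. The function names `mutant_fate` and `fixation_fraction` and their parameters are hypothetical, chosen for this example.

```python
import random


def mutant_fate(pop_size, s, rng):
    """Follow a single new mutant until it is lost or fixed.

    Assumes a haploid population of constant size pop_size, with mutant
    fitness 1 + s relative to a wild-type fitness of 1.
    Returns (fixed, generations).
    """
    count = 1          # the mutant starts in exactly one individual
    generations = 0
    while 0 < count < pop_size:
        # Expected mutant frequency among offspring, weighted by fitness.
        mutant_weight = count * (1 + s)
        p = mutant_weight / (mutant_weight + (pop_size - count))
        # Each offspring independently descends from a mutant with probability p.
        count = sum(1 for _ in range(pop_size) if rng.random() < p)
        generations += 1
    return count == pop_size, generations


def fixation_fraction(pop_size, s, replicates, seed=0):
    """Fraction of independent replicates in which the mutant reaches fixation."""
    rng = random.Random(seed)
    fixed = sum(mutant_fate(pop_size, s, rng)[0] for _ in range(replicates))
    return fixed / replicates


if __name__ == "__main__":
    # Compare a neutral mutant with a mildly advantageous one.
    for s in (0.0, 0.05):
        print(s, fixation_fraction(pop_size=100, s=s, replicates=2000))
```

Running many replicates shows the two possible fates of the mutant, elimination or fixation, and how often each occurs under the chosen value of s. Chance alone decides the outcome when s is zero, which is why the simulation is repeated rather than run once.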