This chapter is provided for the sake of completeness and reference. It can be skipped by readers who have a basic knowledge of phylogenetic analysis.
Phylogenetic analysis aims at uncovering the evolutionary relationships between different species or taxa, to obtain an understanding of the evolution of life on Earth. Phylogenetic trees are widely used to address this task and are usually computed from molecular sequences. They also have applications in many other areas. For example, they are used to determine the age and rate of diversification of taxa, to understand the evolutionary history of gene families, in sequence-analysis methods to allow phylogenetic footprinting, in epidemiology to trace the origin and transmission of infectious diseases, or to study the co-evolution of hosts and parasites.
The main focus of this book is on phylogenetic networks. However, because phylogenetic networks generalize phylogenetic trees, and to make the book reasonably self-contained, this chapter gives a brief introduction to some of the main methods used to infer phylogenetic trees.
Figure 3.1 shows the relationships between some of the main concepts introduced in this chapter. The focus of this chapter is on how to compute unrooted phylogenetic trees. Usually, the process of phylogenetic inference begins with a multiple sequence alignment. From this, one can pursue either a distance-based analysis or a sequence-based one.
In a distance-based analysis of DNA sequences, the Hamming distances between all pairs of aligned sequences are computed first. These distances are then subjected to a distance correction that is based on some appropriate model of evolution.
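The two steps of a distance-based analysis can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the text: the Hamming distance is normalized by the alignment length, gap characters are treated like any other symbol, and the Jukes-Cantor model stands in for "some appropriate model of evolution". The function names are hypothetical and chosen only for illustration.

```python
import math


def hamming_distance(a: str, b: str) -> float:
    """Proportion of aligned positions at which sequences a and b differ.

    Assumption: both sequences are rows of the same multiple sequence
    alignment, so they have equal, non-zero length.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned and non-empty")
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches / len(a)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor correction of an observed proportion of differences p.

    Assumption: Jukes-Cantor is used here only as one example of a
    model-based correction; it is undefined for p >= 0.75, in which case
    this sketch returns infinity.
    """
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def corrected_distances(alignment: dict) -> dict:
    """Corrected distance for every unordered pair of taxa in the alignment."""
    taxa = sorted(alignment)
    result = {}
    for i, s in enumerate(taxa):
        for t in taxa[i + 1:]:
            p = hamming_distance(alignment[s], alignment[t])
            result[(s, t)] = jukes_cantor(p)
    return result
```

For instance, two aligned sequences that differ at one of four positions have a normalized Hamming distance of 0.25, which this correction increases to roughly 0.304; the corrected value is larger because the model accounts for multiple substitutions at the same site.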