Introduction
Comprehensive RNA viral surveillance has become possible thanks to new sequencing platforms and reduced costs (Mardis, 2008). This coverage has transformed the resources available for understanding viral evolution. Despite this promise, clinical applications continue to lag, in part because of a lack of statistical inference tools (Holmes, 2009). Vaccine development and viral eradication rely on determining how viral populations will respond to evolutionary pressures and on pinpointing key drug-resistance mutations (Chen and Lee, 2006). The influx of data from modern viral surveillance may help answer these questions.
We focus on rapidly evolving RNA viruses (Drummond et al., 2003), which pose a particular challenge for vaccine development because vaccines must respond to genetically diverse viral populations (Levin et al., 1999). This diversity derives from a high mutation rate, up to a million-fold higher than that of DNA replication, due to poor proofreading (Holland et al., 1982), combined with rapid replication (Belshaw et al., 2008). These conditions result in populations that can be observed evolving on a human timescale (Duffy et al., 2008), providing a uniquely tractable microcosm. Although viral evolution is influenced by difficult-to-measure effects such as environmental factors and travel, other effects, such as genetic mutation, reassortment, and recombination, are scientifically tractable when framed within phylogenetics (Morens et al., 2004).
In contrast to the 1918 Spanish influenza pandemic, for which the three available samples provide an ambiguous evolutionary history (Reid et al., 1999; Gibbs et al., 2001; Worobey et al., 2002; Taubenberger et al., 2005), analysis of twenty-first-century outbreaks such as the 2009 swine flu will be aided by hundreds (Smith et al., 2009), or even thousands, of contemporaneous sequences.