  • Print publication year: 2016
  • Online publication date: June 2016

10 - Perspective: Systematics in the age of genomics

from Part III - Next Generation Challenges and Questions


The advent of the data age

The study of the DNA and protein record is a key component of systematic biology research. Recent technological advances in genome science have enabled researchers to routinely generate unprecedented, genome-scale amounts of sequence data, opening the floodgates for the study of the genome content and function of any organism across the Tree of Life (ToL) (Rokas and Abbot 2009). The main catalyst for these changes has been the development of several different so-called next generation DNA sequencing (NGS) technologies that are capable of producing orders of magnitude more data, at orders of magnitude lower cost, than Sanger sequencing approaches (Glenn 2011).

Astonishingly, the amount of sequence data that a single NGS machine currently produces in a few days is larger than the total amount of sequence data collected by individual users via traditional methods and deposited in GenBank (Gilad et al. 2009). This phenomenal increase in data generation has enabled not only the collection of more sequence data, but also the systematic collection of new types of sequence data (e.g. microRNAs, SINEs, LINEs and other rare genomic changes) that were previously laborious to obtain, as well as the development of new protocols (e.g. RAD-Tags, Baird et al. 2008) and computational pipelines (e.g. phylogenomics, Hittinger et al. 2010b; metagenomics, Patil et al. 2011) for doing so. Furthermore, NGS technologies yield not only qualitative information about the sequence of every DNA fragment analysed, but also quantitative information about the relative abundance of each DNA fragment in the library sequenced (Rokas and Abbot 2009).

The abundance of NGS data, their qualitative and quantitative nature, and their applicability to the study of any organism for which fresh DNA or RNA is available (this volume, Chapter 14) have enabled researchers to adopt NGS not only for answering old questions with new data rigour, but also for formulating and tackling a new ‘generation’ of questions (Rokas and Abbot 2009).

References

Abzhanov, A., Extavour, C. G., Groover, A., et al. (2008). Are we there yet? Tracking the development of new model systems. Trends in Genetics, 24, 353–60.
Baird, N. A., Etter, P. D., Atwood, T. S., et al. (2008). Rapid SNP discovery and genetic mapping using sequenced RAD markers. PLoS One, 3, e3376.
Benson, D. A., Karsch-Mizrachi, I., Lipman, D. J., Ostell, J. and Sayers, E. W. (2009). GenBank. Nucleic Acids Research, 37, D26–31.
Campbell, M. A., Rokas, A. and Slot, J. C. (2012). Horizontal transfer and death of a fungal secondary metabolic gene cluster. Genome Biology and Evolution, 4, 289–93.
Darwin, C. (1859). On the Origin of Species. London, John Murray.
Domazet-Loso, T., Brajkovic, J. and Tautz, D. (2007). A phylostratigraphy approach to uncover the genomic history of major adaptations in metazoan lineages. Trends in Genetics, 23, 533–9.
Domazet-Loso, T. and Tautz, D. (2010a). A phylogenetically based transcriptome age index mirrors ontogenetic divergence patterns. Nature, 468, 815–8.
Domazet-Loso, T. and Tautz, D. (2010b). Phylostratigraphic tracking of cancer genes suggests a link to the emergence of multicellularity in metazoa. BMC Biology, 8, 66.
Edwards, S. V. (2009). Is a new and general theory of molecular systematics emerging? Evolution, 63, 1–19.
Felsenstein, J. (1981). Evolutionary trees from DNA sequences: a maximum likelihood approach. Journal of Molecular Evolution, 17, 368–76.
Felsenstein, J. (1985). Confidence limits on phylogenies: an approach using the bootstrap. Evolution, 39, 783–91.
Felsenstein, J. (2003). Inferring Phylogenies. Sunderland, MA, Sinauer.
Fleischmann, R. D., Adams, M. D., White, O., et al. (1995). Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science, 269, 496–512.
Genome 10K Community of Scientists (2009). Genome 10K, a proposal to obtain whole-genome sequence for 10,000 vertebrate species. Journal of Heredity, 100, 659–74.
Gibbons, J. G., Janson, E. M., Hittinger, C. T., et al. (2009). Benchmarking next-generation transcriptome sequencing for functional and evolutionary genomics. Molecular Biology and Evolution, 26, 2731–44.
Gilad, Y., Pritchard, J. K. and Thornton, K. (2009). Characterizing natural variation using next-generation sequencing technologies. Trends in Genetics, 25, 463–71.
Glenn, T. C. (2011). Field guide to next-generation DNA sequencers. Molecular Ecology Resources, 11, 759–69.
Goffeau, A., Barrell, B. G., Bussey, H., et al. (1996). Life with 6000 genes. Science, 274, 546, 563–7.
Hillis, D. M. (2010). Phylogenetic progress and applications of the tree of life. In Evolution Since Darwin: The First 150 Years, ed. Bell, M., Futuyma, D. J., Eanes, W. F. and Levinton, J. S. Sunderland, MA, Sinauer; pp. 421–49.
Hittinger, C. T., Goncalves, P., Sampaio, J. P., et al. (2010a). Remarkably ancient balanced polymorphisms in a multi-locus gene network. Nature, 464, 54–8.
Hittinger, C. T., Johnston, M., Tossberg, J. T. and Rokas, A. (2010b). Leveraging skewed transcript abundance by RNA-Seq to increase the genomic depth of the tree of life. Proceedings of the National Academy of Sciences of the United States of America, 107, 1476–81.
Huber, J. A., Mark Welch, D. B., Morrison, H. G., et al. (2007). Microbial population structures in the deep marine biosphere. Science, 318, 97–100.
Kahn, S. D. (2011). On the future of genomic data. Science, 331, 728–9.
Koonin, E. V. (2009). Evolution of genome architecture. The International Journal of Biochemistry and Cell Biology, 41, 298–306.
Kumar, S., Filipski, A. J., Battistuzzi, F. U., Kosakovsky Pond, S. L. and Tamura, K. (2012). Statistics and truth in phylogenomics. Molecular Biology and Evolution, 29, 457–72.
Lee, E. K., Cibrian-Jaramillo, A., Kolokotronis, S. O., et al. (2011). A functional phylogenomic view of the seed plants. PLoS Genetics, 7, e1002411.
Lynch, M. (2007). The Origins of Genome Architecture. Sunderland, MA, Sinauer.
Lynch, M., Sung, W., Morris, K., et al. (2008). A genome-wide view of the spectrum of spontaneous mutations in yeast. Proceedings of the National Academy of Sciences of the United States of America, 105, 9272–7.
Maddison, W. P. (1997). Gene trees in species trees. Systematic Biology, 46, 523–36.
Moore, M. J., Bell, C. D., Soltis, P. S. and Soltis, D. E. (2007). Using plastid genome-scale data to resolve enigmatic relationships among basal angiosperms. Proceedings of the National Academy of Sciences of the United States of America, 104, 19363–8.
Nehrt, N. L., Clark, W. T., Radivojac, P. and Hahn, M. W. (2011). Testing the ortholog conjecture with comparative functional genomic data from mammals. PLoS Computational Biology, 7, e1002073.
Patil, K. R., Haider, P., Pope, P. B., et al. (2011). Taxonomic metagenome sequence assignment with structured output models. Nature Methods, 8, 191–2.
Pena-Castillo, L. and Hughes, T. R. (2007). Why are there still over 1000 uncharacterized yeast genes? Genetics, 176, 7–14.
Proctor, L. M. (2011). The Human Microbiome Project in 2011 and beyond. Cell Host and Microbe, 10, 287–91.
Rokas, A. and Abbot, P. (2009). Harnessing genomics for evolutionary insights. Trends in Ecology and Evolution, 24, 192–200.
Rokas, A. and Carroll, S. B. (2006). Bushes in the tree of life. PLoS Biology, 4, e352.
Rokas, A. and Chatzimanolis, S. (2008). From gene-scale to genome-scale phylogenetics: the data flood in, but the challenges remain. Methods in Molecular Biology, 422, 1–12.
Rokas, A., King, N., Finnerty, J. and Carroll, S. B. (2003a). Conflicting phylogenetic signals at the base of the metazoan tree. Evolution and Development, 5, 346–59.
Rokas, A., Williams, B. L., King, N. and Carroll, S. B. (2003b). Genome-scale approaches to resolving incongruence in molecular phylogenies. Nature, 425, 798–804.
Sanderson, M. J. (2008). Phylogenetic signal in the eukaryotic tree of life. Science, 321, 121–3.
Scannell, D. R., Zill, O. A., Rokas, A., et al. (2011). The awesome power of yeast evolutionary genetics: new genome sequences and strain resources for the Saccharomyces sensu stricto genus. G3, 1, 11–25.
Slot, J. C. and Rokas, A. (2010). Multiple GAL pathway gene clusters evolved independently and by different mechanisms in fungi. Proceedings of the National Academy of Sciences of the United States of America, 107, 10136–41.
Slot, J. C. and Rokas, A. (2011). Horizontal transfer of a large and highly toxic secondary metabolic gene cluster between fungi. Current Biology, 21, 134–9.
Vera, J. C., Wheat, C. W., Fescemyer, H. W., et al. (2008). Rapid transcriptome characterization for a nonmodel organism using 454 pyrosequencing. Molecular Ecology, 17, 1636–47.