15 - From sequence reads to evolutionary inferences

from Part III - Next Generation Challenges and Questions

Published online by Cambridge University Press: 05 June 2016

James A. Cotton
Affiliation:
Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
Peter D. Olson
Affiliation:
Natural History Museum, London
Joseph Hughes
Affiliation:
University of Glasgow
James A. Cotton
Affiliation:
Wellcome Trust Sanger Institute, Cambridge

Summary

Introduction

The history of molecular systematics can be caricatured as one of ever-increasing depth of sequence data, analysed by ever more complex models. In this respect, sequence data from whole genomes are the ultimate source of molecular markers that can act as characters for phylogenetic or population genetic analysis. While complete genomes in the strictest sense are available for only very few species, and fragmentary genome assemblies that capture the entire genome, but in many pieces, are also fairly restricted in scope beyond the prokaryotes, this is changing rapidly. More-or-less shallow genomic data, for example from EST sequencing projects, high-throughput transcriptome sequencing or some other kind of reduced-representation sequencing (see review by Davey et al. 2011), are now becoming widespread and of increasing utility in systematics and other areas of evolutionary biology. Studies using these kinds of data to reconstruct relationships between species have become known as ‘phylogenomics’, although the original usage of the term referred to using phylogenetic approaches to infer gene function (Eisen 1998), and the other parts of the research programme proposed under this name (Eisen and Fraser 2003) have been subsumed into the broader study of comparative and evolutionary genomics. Moreover, the term ‘phylogenomics’ has, perhaps, become over-extended, as datasets that claim this title vary in size and can comprise as few as 11 markers (Horvath et al. 2008) or as little as 30 kb of sequence data (Wiegmann et al. 2011), and in eukaryotic organisms, the ‘genomes’ in question are very often organelle (mitochondrial or chloroplast) genome sequences.
Sequence data from whole genomes have the potential to be a rich source of molecular phylogenetic markers for any systematic question, but there are two areas in which large-scale, highly multi-locus data appear most valuable – occupying the two extremes of the range of timescales over which inference about evolutionary history is made.

Genome-scale data promise the ability to resolve ancient divergences, and in particular, fairly rapid (at least in geological terms) ancient radiations that have been difficult to reliably reconstruct with more limited molecular datasets. In this context, phylogenomic data have been applied to a wide taxonomic range of phylogenetic questions. Early usage of whole-genome data was in prokaryote systematics (e.g. Daubin et al. 2002).

Type: Chapter
Publisher: Cambridge University Press
Print publication year: 2016

References

Aguinaldo, A. M., Turbeville, J. M., Linford, L. S., et al. (1997). Evidence for a clade of nematodes, arthropods and other moulting animals. Nature, 387, 489–93.
Altenhoff, A. M. and Dessimoz, C. (2012). Inferring orthology and paralogy. Methods in Molecular Biology, 855, 259–79.
Altshuler, D., Pollara, V. J., Cowles, C. R., et al. (2000). An SNP map of the human genome generated by reduced representation shotgun sequencing. Nature, 407, 513–16.
Ané, C., Larget, B., Baum, D. A., Smith, S. D. and Rokas, A. (2007). Bayesian estimation of concordance among gene trees. Molecular Biology and Evolution, 24, 412–26.
Anisimova, M. and Gascuel, O. (2006). Approximate likelihood-ratio test for branches: a fast, accurate, and powerful alternative. Systematic Biology, 55, 539–52.
Assefa, S., Keane, T. M., Otto, T. D., Newbold, C. and Berriman, M. (2009). ABACAS: algorithm-based automatic contiguation of assembled sequences. Bioinformatics, 25, 1968–9.
Baird, N. A., Etter, P. D., Atwood, T. S., et al. (2008). Rapid SNP discovery and genetic mapping using sequenced RAD markers. PLoS One, 3, e3376.
Bapteste, E., Susko, E., Leigh, J., et al. (2007). Alternative methods for concatenation of core genes indicate a lack of resolution in deep nodes of the prokaryotic phylogeny. Molecular Biology and Evolution, 25, 83–91.
Barry, D. and Hartigan, J. A. (1987). Asynchronous distance between homologous DNA sequences. Biometrics, 43, 261–76.
Beaumont, M. A. (2010). Approximate Bayesian computation in evolution and ecology. Annual Review of Ecology, Evolution, and Systematics, 41, 379–406.
Blackshields, G., Wallace, I. M., Larkin, M. and Higgins, D. G. (2006). Analysis and comparison of benchmarks for multiple sequence alignment. In Silico Biology, 6, 321–39.
Blair, C. and Murphy, R. W. (2010). Recent trends in molecular phylogenetic analysis: where to next? Journal of Heredity, 102, 130–8.
Blair, J. E., Ikeo, K., Gojobori, T. and Hedges, S. B. (2002). The evolutionary position of nematodes. BMC Evolutionary Biology, 2, 7.
Blanquart, S. and Lartillot, N. (2008). A site- and time-heterogeneous model of amino acid replacement. Molecular Biology and Evolution, 25, 842–58.
Boetzer, M. and Pirovano, W. (2012). Toward almost closed genomes with GapFiller. Genome Biology, 13, R56.
Bradnam, K. R., Fass, J. N., Alexandrov, A., et al. (2013). Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species. GigaScience, 2, 10.
Breese, M. R. and Liu, Y. (2013). NGSUtils: a software suite for analyzing and manipulating next-generation sequencing datasets. Bioinformatics, 29, 494–6.
Brown, J. M. and Lemmon, A. R. (2007). The importance of data partitioning and the utility of Bayes factors in Bayesian phylogenetics. Systematic Biology, 56, 643–55.
Browning, S. R. and Browning, B. L. (2011). Haplotype phasing: existing methods and new developments. Nature Reviews Genetics, 12, 703–14.
Bybee, S. M., Bracken-Grissom, H., Haynes, B. D., et al. (2011). Targeted Amplicon Sequencing (TAS): a scalable next-gen approach to multilocus, multitaxa phylogenetics. Genome Biology and Evolution, 3, 1312–23.
Capella-Gutierrez, S., Silla-Martinez, J. M. and Gabaldon, T. (2009). trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics, 25, 1972–73.
Carstens, B. C., Pelletier, T. A., Reid, N. M. and Satler, J. D. (2013). How to fail at species delimitation. Molecular Ecology, 22, 4369–83.
Castresana, J. (2000). Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Molecular Biology and Evolution, 17, 540–52.
Chain, P. S. G., Grafham, D. V., Fulton, R. S., et al. (2009). Genomics: genome project standards in a new era of sequencing. Science, 326, 236–7.
Choi, S. C. and Hey, J. (2011). Joint inference of population assignment and demographic history. Genetics, 189, 561–77.
Ciccarelli, F., Doerks, T., von Mering, C., et al. (2006). Toward automatic reconstruction of a highly resolved Tree of Life. Science, 311, 1283–7.
Compeau, P. E. C., Pevzner, P. A. and Tesler, G. (2011). How to apply de Bruijn graphs to genome assembly. Nature Biotechnology, 29, 987–91.
Cotton, J. A. and Page, R. D. M. (2005). Rates and patterns of gene duplication and loss in the human genome. Proceedings of the Royal Society B-Biological Sciences, 272, 277–83.
Cotton, J. A. and Wilkinson, M. (2009). Supertrees join the mainstream of phylogenetics. Trends in Ecology and Evolution, 24, 1–3.
Cox, C. J., Foster, P. G., Hirt, R. P., Harris, S. R. and Embley, T. M. (2008). The archaebacterial origin of eukaryotes. Proceedings of the National Academy of Sciences of the United States of America, 105, 20356–61.
Creevey, C. J., Muller, J., Doerks, T., et al. (2011). Identifying single copy orthologs in Metazoa. PLoS Computational Biology, 7, e1002269.
Csilléry, K., Blum, M. G. B., Gaggiotti, O. E. and François, O. (2010). Approximate Bayesian Computation (ABC) in practice. Trends in Ecology and Evolution, 25, 410–18.
Dagan, T. and Martin, W. (2006). The tree of one percent. Genome Biology, 7, 118.
Dalquen, D. A. and Dessimoz, C. (2013). Bidirectional best hits miss many orthologs in duplication-rich clades such as plants and animals. Genome Biology and Evolution, 5, 1800–6.
Danecek, P., Auton, A., Abecasis, G., et al. (2011). The variant call format and VCFtools. Bioinformatics, 27, 2156–8.
Daubin, V., Gouy, M. and Perrière, G. (2002). A phylogenomic approach to bacterial phylogeny: evidence of a core of genes sharing a common history. Genome Research, 12, 1080–90.
Davey, J. W., Hohenlohe, P. A., Etter, P. D., et al. (2011). Genome-wide genetic marker discovery and genotyping using next-generation sequencing. Nature Reviews Genetics, 12, 499–510.
de Koning, A. P. J., Gu, W., Castoe, T. A., Batzer, M. A. and Pollock, D. D. (2011). Repetitive elements may comprise over two-thirds of the human genome. PLoS Genetics, 7, e1002384.
de Queiroz, A., Donoghue, M. J. and Kim, J. (1995). Separate versus combined analysis of phylogenetic evidence. Annual Review of Ecology and Systematics, 26, 657–81.
Degnan, J. H. and Rosenberg, N. A. (2006). Discordance of species trees with their most likely gene trees. PLoS Genetics, 2, e68.
DeLuca, D. S., Levin, J. Z., Sivachenko, A., et al. (2012). RNA-SeQC: RNA-seq metrics for quality control and process optimization. Bioinformatics, 28, 1530–2.
Downing, T., Imamura, H., Decuypere, S., et al. (2011). Whole genome sequencing of multiple Leishmania donovani clinical isolates provides insights into population structure and mechanisms of drug resistance. Genome Research, 21, 2143–56.
Dunn, C. W., Hejnol, A., Matus, D. Q., et al. (2008). Broad phylogenomic sampling improves resolution of the animal Tree of Life. Nature, 452, 745–9.
Dunn, C. W., Howison, M. and Zapata, F. (2013). Agalma: an automated phylogenomics workflow. BMC Bioinformatics, 14, 330.
Edgecombe, G. D., Giribet, G., Dunn, C. W., et al. (2011). Higher-level Metazoan relationships: recent progress and remaining questions. Organisms Diversity and Evolution, 11, 151–72.
Edwards, S. V., Liu, L. and Pearl, D. K. (2007). High-resolution species trees without concatenation. Proceedings of the National Academy of Sciences of the United States of America, 104, 5936–41.
Eisen, J. A. (1998). Phylogenomics: improving functional predictions for uncharacterized genes by evolutionary analysis. Genome Research, 8, 163–7.
Eisen, J. A. and Fraser, C. M. (2003). Phylogenomics: intersection of evolution and genomics. Science, 300, 1706–7.
Erixon, P., Svennblad, B., Britton, T. and Oxelman, B. (2003). Reliability of Bayesian posterior probabilities and bootstrap frequencies in phylogenetics. Systematic Biology, 52, 665–73.
Ewens, W. J. (1972). The sampling theory of selectively neutral alleles. Theoretical Population Biology, 3, 87–112.
Excoffier, L., Dupanloup, I., Huerta-Sánchez, E., Sousa, V. C. and Foll, M. (2013). Robust demographic inference from genomic and SNP data. PLoS Genetics, 9, e1003905.
Fedrigo, O., Naylor, G. and Collins, T. (2005). Choosing the best genes for the job: the case for stationary genes in genome-scale phylogenetics. Systematic Biology, 54, 493–500.
Flouri, T., Izquierdo-Carrasco, F., Darriba, D., et al. (2015). The phylogenetic likelihood library. Systematic Biology, 64, 356–62.
Fonseca, N. A., Rung, J., Brazma, A. and Marioni, J. C. (2012). Tools for mapping high-throughput sequencing data. Bioinformatics, 28, 3169–77.
Foster, P. G. (2004). Modeling compositional heterogeneity. Systematic Biology, 53, 485–95.
Galtier, N. and Gouy, M. (1998). Inferring pattern and process: maximum-likelihood implementation of a nonhomogeneous model of DNA sequence evolution for phylogenetic analysis. Molecular Biology and Evolution, 15, 871–9.
Gascuel, O. and Steel, M. (2006). Neighbor-joining revealed. Molecular Biology and Evolution, 23, 1997–2000.
Gatesy, J. and Baker, R. (2005). Hidden likelihood support in genomic data: can forty-five wrongs make a right? Systematic Biology, 54, 483–92.
Gayral, P., Melo-Ferreira, J., Glémin, S., et al. (2013). Reference-free population genomics from next-generation transcriptome data and the vertebrate–invertebrate gap. PLoS Genetics, 9, e1003457.
Gee, H. (2003). Evolution: ending incongruence. Nature, 425, 782.
Gnirke, A., Melnikov, A., Maguire, J., et al. (2009). Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing. Nature Biotechnology, 27, 182–9.
Godden, G. T., Jordon-Thaden, I. E. and Chamala, S. (2012). Making next-generation sequencing work for you: approaches and practical considerations for marker development and phylogenetics. Plant Ecology and Diversity, 5, 427–50.
Goloboff, P. A., Farris, J. S. and Nixon, K. C. (2008). TNT, a free program for phylogenetic analysis. Cladistics, 24, 774–86.
Goodman, M., Czelusniak, J., Moore, G. W., Romero-Herrera, A. E. and Matsuda, G. (1979). Fitting the gene lineage into its species lineage, a parsimony strategy illustrated by cladograms from globin sequences. Systematic Zoology, 28, 132–63.
Grant, J. R. and Katz, L. A. (2014). Building a phylogenomic pipeline for the eukaryotic tree of life – addressing deep phylogenies with genome-scale data. PLoS Currents Apr, 6.
Gremme, G., Steinbiss, S. and Kurtz, S. (2013). GenomeTools: a comprehensive software library for efficient processing of structured genome annotations. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 10, 645–56.
Gronau, I., Hubisz, M. J., Gulko, B., Danko, C. G. and Siepel, A. (2011). Bayesian inference of ancient human demography from individual genome sequences. Nature Genetics, 43, 1031–4.
Guindon, S. and Gascuel, O. (2003). A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Systematic Biology, 52, 696–704.
Gutenkunst, R. N., Hernandez, R. D., Williamson, S. H. and Bustamante, C. D. (2009). Inferring the joint demographic history of multiple populations from multidimensional SNP frequency data. PLoS Genetics, 5, e1000695.
Harris, K. and Nielsen, R. (2013). Inferring demographic history from a spectrum of shared haplotype lengths. PLoS Genetics, 9, e1003521.
Heled, J. and Drummond, A. J. (2010). Bayesian inference of species trees from multilocus data. Molecular Biology and Evolution, 27, 570–80.
Hess, J. and Goldman, N. (2011). Addressing inter-gene heterogeneity in maximum likelihood phylogenomic analysis: yeasts revisited. PLoS One, 6, e22783.
Hobolth, A., Christensen, O. F., Mailund, T. and Schierup, M. H. (2007). Genomic relationships and speciation times of human, chimpanzee, and gorilla inferred from a coalescent hidden Markov model. PLoS Genetics, 3, e7.
Holland, B. R. (2004). Using consensus networks to visualize contradictory evidence for species phylogeny. Molecular Biology and Evolution, 21, 1459–61.
Holland, B. R., Jarvis, P. D. and Sumner, J. G. (2012). Low-parameter phylogenetic inference under the general Markov model. Systematic Biology, 62, 78–92.
Horvath, J. E., Weisrock, D. W., Embry, S. L., et al. (2008). Development and application of a phylogenomic toolkit: resolving the evolutionary history of Madagascar's lemurs. Genome Research, 18, 489–99.
Hunt, M., Newbold, C., Berriman, M. and Otto, T. D. (2014). A comprehensive evaluation of assembly scaffolding tools. Genome Biology, 15, R42.
Iqbal, Z., Caccamo, M., Turner, I., Flicek, P. and McVean, G. (2012). De novo assembly and genotyping of variants using colored de Bruijn graphs. Nature Genetics, 44, 226–32.
Jeffroy, O., Brinkmann, H., Delsuc, F. and Philippe, H. (2006). Phylogenomics: the beginning of incongruence? Trends in Genetics, 22, 225–31.
Jones, M. O., Koutsovoulos, G. D. and Blaxter, M. L. (2011). iPhy: an integrated phylogenetic workbench for supermatrix analyses. BMC Bioinformatics, 12, 30.
Kao, R. R., Haydon, D. T., Lycett, S. J. and Murcia, P. R. (2014). Supersize me: how whole-genome sequencing and big data are transforming epidemiology. Trends in Microbiology, 22, 282–91.
Koren, S., Harhay, G. P., Smith, T. P., et al. (2013). Reducing assembly complexity of microbial genomes with single-molecule sequencing. Genome Biology, 14, R101.
Kubatko, L. S., Carstens, B. C. and Knowles, L. L. (2009). STEM: species tree estimation using maximum likelihood for gene trees under coalescence. Bioinformatics, 25, 971–3.
Kumar, S., Filipski, A. J., Battistuzzi, F. U., Kosakovsky Pond, S. L. and Tamura, K. (2012). Statistics and truth in phylogenomics. Molecular Biology and Evolution, 29, 457–72.
Landan, G. and Graur, D. (2007). Heads or tails: a simple reliability check for multiple sequence alignments. Molecular Biology and Evolution, 24, 1380–3.
Lanfear, R., Calcott, B., Ho, S. Y. W. and Guindon, S. (2012). PartitionFinder: combined selection of partitioning schemes and substitution models for phylogenetic analyses. Molecular Biology and Evolution, 29, 1695–701.
Lartillot, N. and Philippe, H. (2004). A Bayesian mixture model for across-site heterogeneities in the amino-acid replacement process. Molecular Biology and Evolution, 21, 1095–109.
Latreille, P., Norton, S., Goldman, B. S., et al. (2007). Optical mapping as a routine tool for bacterial genome sequence finishing. BMC Genomics, 8, 321.
Lee, E. K., Cibrian-Jaramillo, A., Kolokotronis, S.-O., et al. (2011). A functional phylogenomic view of the seed plants. PLoS Genetics, 7, e1002411.
Lemmon, A. R., Brown, J. M., Stanger-Hall, K. and Lemmon, E. M. (2009). The effect of ambiguous data on phylogenetic estimates obtained by maximum likelihood and Bayesian inference. Systematic Biology, 58, 130–45.
Lemmon, A. R., Emme, S. A. and Lemmon, E. M. (2012). Anchored hybrid enrichment for massively high-throughput phylogenomics. Systematic Biology, 61, 727–44.
Lemmon, E. M. and Lemmon, A. R. (2013). High-throughput genomic data in systematics and phylogenetics. Annual Review of Ecology, Evolution, and Systematics, 44, 99–121.
Li, H. and Durbin, R. (2011). Inference of human population history from individual whole-genome sequences. Nature, 475, 493–6.
Li, H., Handsaker, B., Wysoker, A., et al. (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics, 25, 2078–9.
Li, H. and Homer, N. (2010). A survey of sequence alignment algorithms for next-generation sequencing. Briefings in Bioinformatics, 11, 473–83.
Li, L., Stoeckert, C. J. and Roos, D. S. (2003). OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Research, 13, 2178–89.
Li, R., Zhu, H., Ruan, J., et al. (2010). De novo assembly of human genomes with massively parallel short read sequencing. Genome Research, 20, 265–72.
Liu, K., Raghavan, S., Nelesen, S., Linder, C. R. and Warnow, T. (2009). Rapid and accurate large-scale coestimation of sequence alignments and phylogenetic trees. Science, 324, 1561–4.
Liu, L., Yu, L., Kubatko, L., Pearl, D. K. and Edwards, S. V. (2009). Coalescent methods for estimating phylogenetic trees. Molecular Phylogenetics and Evolution, 53, 320–8.
Löytynoja, A. and Goldman, N. (2005). An algorithm for progressive multiple alignment of sequences with insertions. Proceedings of the National Academy of Sciences of the United States of America, 102, 10557–62.
Löytynoja, A. and Milinkovitch, M. C. (2001). SOAP: cleaning multiple alignments from unstable blocks. Bioinformatics, 17, 573–4.
Maddison, W. and Knowles, L. (2006). Inferring phylogeny despite incomplete lineage sorting. Systematic Biology, 55, 21–30.
Mallatt, J. M., Garey, J. R. and Shultz, J. W. (2004). Ecdysozoan phylogeny and Bayesian inference: first use of nearly complete 28S and 18S rRNA gene sequences to classify the arthropods and their kin. Molecular Phylogenetics and Evolution, 31, 178–91.
Mamanova, L., Coffey, A. J., Scott, C. E., et al. (2010). Target-enrichment strategies for next-generation sequencing. Nature Methods, 7, 111–18.
Manske, M., Miotto, O., Campino, S., et al. (2012). Analysis of Plasmodium falciparum diversity in natural infections by deep sequencing. Nature, 487, 375–9.
McCormack, J. E., Hird, S. M., Zellmer, A. J., Carstens, B. C. and Brumfield, R. T. (2013). Applications of next-generation sequencing to phylogeography and phylogenetics. Molecular Phylogenetics and Evolution, 66, 526–38.
McVean, G. A. T. and Cardin, N. J. (2005). Approximating the coalescent with recombination. Philosophical Transactions of the Royal Society B-Biological Sciences, 360, 1387–93.
Medvedev, P., Stanciu, M. and Brudno, M. (2009). Computational methods for discovering structural variation with next-generation sequencing. Nature Methods, 6, S13–S20.
Miller, J. R., Koren, S. and Sutton, G. (2010). Assembly algorithms for next-generation sequencing data. Genomics, 95, 315–27.
Morrison, D. A. and Ellis, J. T. (1997). Effects of nucleotide sequence alignment on phylogeny estimation: a case study of 18S rDNAs of Apicomplexa. Molecular Biology and Evolution, 14, 428–41.
Mullikin, J. C. and Ning, Z. (2003). The Phusion Assembler. Genome Research, 13, 81–90.
Nguyen-Dumont, T., Pope, B. J., Hammet, F., Southey, M. C. and Park, D. J. (2013). A high-plex PCR approach for massively parallel sequencing. BioTechniques, 55, 69–74.
Nichols, R. (2001). Gene trees and species trees are not the same. Trends in Ecology and Evolution, 16, 358–64.
Nielsen, R., Hellmann, I., Hubisz, M., Bustamante, C. and Clark, A. G. (2007). Recent and ongoing selection in the human genome. Nature Reviews Genetics, 8, 857–68.
Nielsen, R., Paul, J. S., Albrechtsen, A. and Song, Y. S. (2011). Genotype and SNP calling from next-generation sequencing data. Nature Reviews Genetics, 12, 443–51.
Nosenko, T., Schreiber, F., Adamska, M., et al. (2013). Deep metazoan phylogeny: when different genes tell different stories. Molecular Phylogenetics and Evolution, 67, 223–33.
Nylander, J. A. A., Ronquist, F., Huelsenbeck, J. P. and Nieves-Aldrey, J.-L. (2004). Bayesian phylogenetic analysis of combined data. Systematic Biology, 53, 47–67.
Ogden, T. H. and Rosenberg, M. S. (2006). Multiple sequence alignment accuracy and phylogenetic inference. Systematic Biology, 55, 314–28.
Page, R. D. and Charleston, M. A. (1997). From gene to organismal phylogeny: reconciled trees and the gene tree/species tree problem. Molecular Phylogenetics and Evolution, 7, 231–40.
Pagel, M. and Meade, A. (2004). A phylogenetic mixture model for detecting pattern-heterogeneity in gene sequence or character-state data. Systematic Biology, 53, 571–81.
Parkhill, J. (2002). The importance of complete genome sequences. Trends in Microbiology, 10, 219–20; author reply 220.
Penny, D., McComish, B. J., Charleston, M. A. and Hendy, M. D. (2014). Mathematical elegance with biochemical realism: the covarion model of molecular evolution. Journal of Molecular Evolution, 53, 711–23.
Perkel, J. (2008). SNP genotyping: six technologies that keyed a revolution. Nature Methods, 5, 447–53.
Philip, G. K., Creevey, C. J. and McInerney, J. O. (2005). The Opisthokonta and the Ecdysozoa may not be clades: stronger support for the grouping of plant and animal than for animal and fungi and stronger support for the Coelomata than Ecdysozoa. Molecular Biology and Evolution, 22, 1175–84.
Philippe, H., Delsuc, F., Brinkmann, H. and Lartillot, N. (2005a). Phylogenomics. Annual Review of Ecology, Evolution, and Systematics, 36, 541–62.
Philippe, H., Lartillot, N. and Brinkmann, H. (2005b). Multigene analyses of bilaterian animals corroborate the monophyly of Ecdysozoa, Lophotrochozoa, and Protostomia. Molecular Biology and Evolution, 22, 1246–53.
Phillips, M. J. (2004). Genome-scale phylogeny and the detection of systematic biases. Molecular Biology and Evolution, 21, 1455–8.
Pisani, D. (2004). Identifying and removing fast-evolving sites using compatibility analysis: an example from the Arthropoda. Systematic Biology, 53, 978–89.
Pisani, D., Cotton, J. A. and McInerney, J. O. (2007). Supertrees disentangle the chimerical origin of eukaryotic genomes. Molecular Biology and Evolution, 24, 1752–60.
Pons, J., Barraclough, T., Gómez-Zurita, J., et al. (2006). Sequence-based species delimitation for the DNA taxonomy of undescribed insects. Systematic Biology, 55, 595–609.
Pool, J. E., Hellmann, I., Jensen, J. D. and Nielsen, R. (2010). Population genetic inference from genomic sequence variation. Genome Research, 20, 291–300.
Posada, D. and Buckley, T. (2004). Model selection and model averaging in phylogenetics: advantages of Akaike information criterion and Bayesian approaches over likelihood ratio tests. Systematic Biology, 53, 793–808.
Qiu, Y.-L., Li, L., Wang, B., et al. (2006). The deepest divergences in land plants inferred from phylogenomic evidence. Proceedings of the National Academy of Sciences of the United States of America, 103, 15511–16.
Quinlan, A. R. and Hall, I. M. (2010). BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics, 26, 841–2.
Rannala, B. and Yang, Z. (2003). Bayes estimation of species divergence times and ancestral population sizes using DNA sequences from multiple loci. Genetics, 164, 1645–56.
Rannala, B. and Yang, Z. (2008). Phylogenetic inference using whole genomes. Annual Review of Genomics and Human Genetics, 9, 217–31.
Rhaesa, A. S., Bartolomaeus, T., Lemburg, C., Ehlers, U. and Garey, J. R. (1998). The position of the Arthropoda in the phylogenetic system. Journal of Morphology, 238, 263–85.
Rodríguez-Ezpeleta, N., Brinkmann, H., Roure, B., et al. (2007). Detecting and overcoming systematic errors in genome-scale phylogenies. Systematic Biology, 56, 389–99.
Rokas, A., Williams, B. L., King, N. and Carroll, S. B. (2003). Genome-scale approaches to resolving incongruence in molecular phylogenies. Nature, 425, 798–804.
Rosenberg, M. S., ed. (2011). Sequence Alignment: Methods, Models, Concepts, and Strategies. Oakland, CA, University of California Press.
Rosenberg, N. A. and Nordborg, M. (2002). Genealogical trees, coalescent theory and the analysis of genetic polymorphisms. Nature Reviews Genetics, 3, 380–90.CrossRefGoogle ScholarPubMed
Roth, A. C., Gonnet, G. H. and Dessimoz, C. (2009). Algorithm of OMA for large-scale orthology inference. BMC Bioinformatics, 10, 220.CrossRefGoogle Scholar
Roure, B., Baurain, D. and Philippe, H. (2012). Impact of missing data on phylogenies inferred from empirical phylogenomic data sets. Molecular Biology and Evolution, 30, 197–214.Google ScholarPubMed
Salichos, L. and Rokas, A. (2014). Inferring ancient divergences requires genes with strong phylogenetic signals. Nature, 497, 327–31.Google Scholar
Sankoff, D., Morel, C. and Cedergren, R. J. (1973). Evolution of 5S RNA and the non-randomness of base replacement. Nature New Biology, 245, 232–4.CrossRefGoogle ScholarPubMed
Scheinfeldt, L. B. and Tishkoff, S. A. (2013). Recent human adaptation: genomic approaches, interpretation and insights. Nature Reviews Genetics, 14, 692–702.CrossRefGoogle ScholarPubMed
Schiffels, S. and Durbin, R. (2014). Inferring human population size and separation history from multiple genome sequences. Nature Genetics, 46, 919–25.CrossRefGoogle ScholarPubMed
Schneeberger, K., Ossowski, S., Ott, F., et al. (2011). Reference-guided assembly of four diverse Arabidopsis thaliana genomes. Proceedings of the National Academy of Sciences of the United States of America, 108, 10249–54.CrossRefGoogle ScholarPubMed
Scholtz, G. (2002). The Articulata hypothesis – or what is a segment?Organisms Diversity and Evolution, 2, 197–215.CrossRefGoogle Scholar
Shapiro, B. (2005). Choosing appropriate substitution models for the phylogenetic analysis of protein-coding sequences. Molecular Biology and Evolution, 23, 7–9.
Simpson, J. T. and Durbin, R. (2010). Efficient construction of an assembly string graph using the FM-Index. Bioinformatics, 26, i367–73.
Simpson, J. T. and Durbin, R. (2012). Efficient de novo assembly of large genomes using compressed data structures. Genome Research, 22, 549–56.
Simpson, J. T., Wong, K., Jackman, S. D., et al. (2009). ABySS: a parallel assembler for short read sequence data. Genome Research, 19, 1117–23.
Smith, B. T., Harvey, M. G., Faircloth, B. C., Glenn, T. C. and Brumfield, R. T. (2013). Target capture and massively parallel sequencing of ultraconserved elements for comparative studies at shallow evolutionary time scales. Systematic Biology, 63, 83–95.
Sousa, V. and Hey, J. (2013). Understanding the origin of species with genome-scale data: modelling gene flow. Nature Reviews Genetics, 14, 404–14.
Spang, A., Saw, J. H., Jørgensen, S. L., et al. (2015). Complex Archaea that bridge the gap between prokaryotes and eukaryotes. Nature, 521, 173–9.
Stamatakis, A. (2014). RAxML Version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics, 30, 1312–13.
Stamatakis, A., Hoover, P. and Rougemont, J. (2008). A rapid bootstrap algorithm for the RAxML web servers. Systematic Biology, 57, 758–71.
Steel, M. (2005). Should phylogenetic models be trying to “fit an elephant”? Trends in Genetics, 21, 307–9.
Struck, T. H., Paul, C., Hill, N., et al. (2011). Phylogenomic analyses unravel annelid evolution. Nature, 471, 95–8.
Suchard, M. A. and Rambaut, A. (2009). Many-core algorithms for statistical phylogenetics. Bioinformatics, 25, 1370–6.
Swain, M. T., Tsai, I. J., Assefa, S. A., et al. (2012). A post-assembly genome-improvement toolkit (PAGIT) to obtain annotated genomes from contigs. Nature Protocols, 7, 1260–84.
Swofford, D. L., Olsen, G. J., Waddell, P. J. and Hillis, D. M. (1996). Phylogenetic inference. In Molecular Systematics, ed. Hillis, D. M., Moritz, C. and Mable, B. K. Sunderland, MA, Sinauer Associates; pp. 407–515.
Szöllősi, G. J., Tannier, E., Daubin, V. and Boussau, B. (2015). The inference of gene trees with species trees. Systematic Biology, 64, e42–e62.
Taylor, D. J. (2004). An assessment of accuracy, error, and conflict with support values from genome-scale phylogenetic data. Molecular Biology and Evolution, 21, 1534–7.
Telford, M. J., Bourlat, S. J., Economou, A., Papillon, D. and Rota-Stabelli, O. (2008). The evolution of the Ecdysozoa. Philosophical Transactions of the Royal Society of London Series B-Biological Sciences, 363, 1529–37.
Tewhey, R., Warner, J. B., Nakano, M., et al. (2009). Microdroplet-based PCR enrichment for large-scale targeted sequencing. Nature Biotechnology, 27, 1025–31.
The 1000 Genomes Project Consortium (2013). An integrated map of genetic variation from 1,092 human genomes. Nature, 490, 56–65.
Thompson, J. D., Linard, B., Lecompte, O. and Poch, O. (2011). A comprehensive benchmark study of multiple sequence alignment methods: current challenges and future perspectives. PLoS One, 6, e18093.
Thompson, J. F. and Milos, P. M. (2011). The properties and applications of single-molecule DNA sequencing. Genome Biology, 12, 217.
Timme, R. E., Bachvaroff, T. R. and Delwiche, C. F. (2012). Broad phylogenomic sampling and the sister lineage of land plants. PLoS One, 7, e29696.
Treangen, T. J. and Salzberg, S. L. (2011). Repetitive DNA and next-generation sequencing: computational challenges and solutions. Nature Reviews Genetics, 13, 36–46.
Trivedi, U. H. (2014). Quality control of next-generation sequencing data without a reference. Frontiers in Genetics, 5, 111.
Turner, E. H., Ng, S. B., Nickerson, D. A. and Shendure, J. (2009). Methods for genomic partitioning. Annual Review of Genomics and Human Genetics, 10, 263–84.
Vilella, A. J., Severin, J., Ureta-Vidal, A., et al. (2008). EnsemblCompara genetrees: complete, duplication-aware phylogenetic trees in vertebrates. Genome Research, 19, 327–35.
Vitti, J. J., Grossman, S. R. and Sabeti, P. C. (2013). Detecting natural selection in genomic data. Annual Review of Genetics, 47, 97–120.
Watson, M. (2014). Quality assessment and control of high-throughput sequencing data. Frontiers in Genetics, 5, 235.
Westesson, O., Barquist, L. and Holmes, I. (2012). HandAlign: Bayesian multiple sequence alignment, phylogeny and ancestral reconstruction. Bioinformatics, 28, 1170–1.
Wheeler, W. C. and Gladstein, D. S. (1994). MALIGN: a multiple sequence alignment program. Journal of Heredity, 85, 417–18.
Whelan, N. V., Kocot, K. M., Moroz, L. L. and Halanych, K. M. (2015). Error, signal, and the placement of Ctenophora sister to all other animals. Proceedings of the National Academy of Sciences of the United States of America, 112, 5773–8.
Whelan, S. (2008). Spatial and temporal heterogeneity in nucleotide sequence evolution. Molecular Biology and Evolution, 25, 1683–94.
Whitelaw, C. A., Barbazuk, W. B., Pertea, G., et al. (2003). Enrichment of gene-coding sequences in maize by genome filtration. Science, 302, 2118–20.
Wiegmann, B. M., Trautwein, M. D., Winkler, I. S., et al. (2011). Episodic radiations in the fly tree of life. Proceedings of the National Academy of Sciences of the United States of America, 108, 5690–5.
Wilkinson, M. (2006). Identifying stable reference taxa for phylogenetic nomenclature. Zoologica Scripta, 35, 109–12.
Williams, T. A., Foster, P. G., Cox, C. J. and Embley, T. M. (2014). An archaeal origin of eukaryotes supports only two primary domains of life. Nature, 504, 231–6.
Williams, T. A., Foster, P. G., Nye, T. M. W., Cox, C. J. and Embley, T. M. (2012). A congruent phylogenomic signal places eukaryotes within the Archaea. Proceedings of the Royal Society B – Biological Sciences, 279, 4870–9.
Wong, K. M., Suchard, M. A. and Huelsenbeck, J. P. (2008). Alignment uncertainty and genomic analysis. Science, 319, 473–6.
Wood, D. E. and Salzberg, S. L. (2014). Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome Biology, 15, R46.
Wu, M., Chatterji, S. and Eisen, J. A. (2012). Accounting for alignment uncertainty in phylogenomics. PLoS One, 7, e30288.
Wu, M. and Eisen, J. A. (2008). A simple, fast, and accurate method of phylogenomic inference. Genome Biology, 9, R151.
Yalcin, B., Adams, D. J., Flint, J. and Keane, T. M. (2012). Next-generation sequencing of experimental mouse strains. Mammalian Genome, 23, 490–8.
Yang, Z. (1996a). Maximum-likelihood models for combined analyses of multiple sequence data. Journal of Molecular Evolution, 42, 587–96.
Yang, Z. (1996b). Among-site rate variation and its impact on phylogenetic analyses. Trends in Ecology and Evolution, 11, 367–72.
Zerbino, D. R. and Birney, E. (2008). Velvet: algorithms for de novo short read assembly using De Bruijn graphs. Genome Research, 18, 821–9.
Zhou, X. and Rokas, A. (2014). Prevention, diagnosis and treatment of high-throughput sequencing data pathologies. Molecular Ecology, 23, 1679–700.
