
17 - High-throughput sequence analyses

Published online by Cambridge University Press:  05 February 2015

Tim J. Stevens (MRC Laboratory of Molecular Biology, Cambridge)
Wayne Boucher (University of Cambridge)

Summary

High-throughput sequencing

As the cost of determining nucleic acid sequences falls, sequencing is being used in an increasingly wide range of contexts. Rather than only determining the genome sequence of an organism, high-throughput techniques allow researchers to investigate much more, such as the variation between individuals in a population, the level of expression of individual genes in a given sample (e.g. by detecting RNAs) and the sequences that are bound by particular protein components. A single run on one of the latest-generation sequencing machines may generate many gigabases (>10^9 bp) of data, so much of the task for bioinformatics is to make sense of the raw sequence data: to put it into a genomic, biological context.

For organisms with a known genomic sequence, the primary task when processing high-throughput sequence data is simply to map the relatively short stretches of sequence that come from the sequencing machine, called ‘reads’, to a reference genome. Only then can the detected sequences be understood. Once newly acquired sequences have been mapped on to the known chromosomes, the whole database of information that annotates the genome, such as the positions of genes and regulatory sequences, can be used to indicate which DNA features were detected.

In this chapter we give an introduction to various basic computational procedures involving high-throughput sequence data that can be achieved, or at least handled, using Python. Because this is a vast and rapidly expanding subject we can only touch lightly on the core concepts here, though we hope to have provided solid starting points for further development.
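The idea of mapping reads to a reference and then consulting annotations can be illustrated with a deliberately naive Python sketch. It is not the chapter's own code: the function names `map_reads` and `features_hit`, the toy reference sequence and the `features` coordinates are all hypothetical, and only exact forward-strand matches are found. Real read mapping relies on dedicated alignment programs that tolerate mismatches and scale to whole genomes.

```python
def map_reads(reference, reads):
    """Map each read to every exact-match position in the reference.

    Returns a dict of read -> list of 0-based start positions.
    Assumption: forward strand only, no mismatches or gaps.
    """
    positions = {}
    for read in reads:
        hits = []
        start = reference.find(read)
        while start != -1:
            hits.append(start)
            # Search again from the next base so overlapping hits are found
            start = reference.find(read, start + 1)
        positions[read] = hits
    return positions


def features_hit(start, length, features):
    """Return names of annotated features overlapping a mapped read.

    `features` maps a name to a half-open (start, end) interval.
    """
    end = start + length
    return [name for name, (f_start, f_end) in features.items()
            if start < f_end and end > f_start]


# Toy data: hypothetical reference, reads and annotation coordinates
reference = "ACGTACGTTAGCCGATAGCTAGCTTACG"
reads = ["ACGT", "TAGC", "GGGG"]
features = {"geneA": (0, 8), "geneB": (14, 24)}

mapped = map_reads(reference, reads)
for read, starts in mapped.items():
    for start in starts:
        print(read, start, features_hit(start, len(read), features))
```

With this toy data a read can map to several positions or to none at all, which is one reason practical pipelines also record a mapping quality for each placement.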

Type: Chapter
Information: Python Programming for Biology: Bioinformatics and Beyond, pp. 341-360
Publisher: Cambridge University Press
Print publication year: 2015


