Book contents
- Frontmatter
- Contents
- Preface
- 1 The Molecular Biology Data Explosion
- 2 Introduction to Genome Browsing with the UCSC Genome Browser
- 3 Browsing with Ensembl, MapViewer, and Other Genome Browsers
- 4 Interactive Genome-Database Batch Querying
- 5 Interactive Batch Post-Processing with Galaxy
- 6 Introduction to Programmed Querying
- 7 Using the Ensembl API
- 8 Programmed Querying with Ensembl, Continued
- 9 Introduction to the UCSC API
- 10 More Advanced Applications Using the UCSC API
- 11 Customized Genome Databases
- 12 Genomes, Browsers, Databases – The Future
- Appendix 1 Coordinate System Conventions
- Appendix 2 Genome Data Formats
- Appendix 3 UCSC Table Formats
- Appendix 4 Genomic Sequence Alignments
- Appendix 5 Program Code README File
- Appendix 6 Selected General References for Genome Databases and Browsers
- Appendix 7 Online Documentation and Useful Web Sites for Genome Databases and Browsers
- Appendix 8 Glossary of Biological and Computer Terms Used in the Text
- References
- Index
Preface
Published online by Cambridge University Press: 14 May 2010
Summary
The idea behind this book developed in late 2004–early 2005 while I was working on two unrelated projects in computational genomics. The first project involved the computational detection of small nucleolar RNAs (snoRNAs) in genome sequences. In the course of this work, I noticed – as others had as well – that, in mammals, snoRNA genes are located within introns of protein-coding genes (so-called snoRNA host genes), which are often genes that code for ribosomal proteins. This observation led to speculation as to whether there were additional common features of the introns and genes that contain snoRNAs. For example, are the host genes of homologous mammalian snoRNAs themselves homologous? Do those host genes have other shared functions beyond the fact that several of them code for ribosomal proteins? Are the introns that contain the snoRNAs consistently longer (or shorter) than the average introns found in these genes? Are the snoRNAs found at any characteristic distance from the nearest exon-intron junctions in their host gene? To answer these questions would require accessing sequence and annotation data for both the human and mouse genomes and performing some simple calculations and statistics on that data. Moreover, because there were some 200 human snoRNAs already known (and a similar number of mouse snoRNAs), performing this data acquisition and manipulation would require computer processing.
- Type: Chapter
- Information: Genomes, Browsers and Databases: Data-Mining Tools for Integrated Genomic Databases, pp. ix-xii. Publisher: Cambridge University Press. Print publication year: 2008.