Genomics, the newest branch of genetics, is the study of genome structure and function. It encompasses massive genome-wide mapping, determination of the primary nucleotide sequence of whole genomes, analysis of the spatial relationships of sequences or classes of sequence within and between chromosomes, inventory of the genome by sequence or gene class, and global analysis of gene expression. Genomics emphasizes genes over nontranscribed, nonregulatory sequences. A major challenge in genomics is the analysis of very large amounts of information.
The first step in genomic analysis is construction of a fully representative, high-quality genomic library. A large, pure sample of the organism of interest is collected and treated physically to separate genomic DNA from the other cellular components. The DNA is extracted chemically, purified, and cleaved. The fragments are cloned in a suitable vector, commonly a cosmid, bacteriophage P1, BAC, or YAC (Chapter 27). To ensure that the library contains overlapping clones that span the entire genome, the DNA is only partially digested, and the cloned segments form a large random sample in which each sequence is represented, on average, 10 to 30 times. A set of overlapping, cloned, sequenced DNA segments is called a contig, because the sequence of the region spanned by the segments is contiguous, with no gaps (Figure 30.1). In genome sequencing, the ideal is to render each chromosome as a single contig, that is, an array of overlapping fragments covering the chromosome's entire DNA molecule.
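The relationship between overlapping clones, contigs, and library redundancy can be illustrated with a minimal Python sketch. It assumes that each clone has already been mapped to a start and end coordinate on a chromosome, which is a simplification: in practice overlaps are inferred from shared sequence or shared markers. The function names, the coordinate convention, and the example clones are hypothetical and are not taken from the text. The last function assumes an idealized model in which clones fall uniformly at random on the genome, so that the expected covered fraction at mean redundancy c is 1 - e^(-c).

```python
import math
from typing import List, Tuple

# A clone is modeled as (start, end) in base pairs on one chromosome,
# half-open: start is included, end is excluded. Hypothetical convention.
Clone = Tuple[int, int]


def merge_into_contigs(clones: List[Clone], min_overlap: int = 1) -> List[Clone]:
    """Merge clones that overlap by at least min_overlap bases into contigs."""
    contigs: List[Clone] = []
    for start, end in sorted(clones):
        # Extend the current contig only if this clone truly overlaps it.
        if contigs and contigs[-1][1] - start >= min_overlap:
            prev_start, prev_end = contigs[-1]
            contigs[-1] = (prev_start, max(prev_end, end))
        else:
            # No overlap with the previous contig: a gap, so start a new one.
            contigs.append((start, end))
    return contigs


def mean_redundancy(clones: List[Clone], genome_length: int) -> float:
    """Average number of times each base is represented in the library."""
    return sum(end - start for start, end in clones) / genome_length


def expected_fraction_covered(redundancy: float) -> float:
    """Expected covered fraction, assuming uniformly random clone placement."""
    return 1.0 - math.exp(-redundancy)


# Four made-up clones on a 160 kb region; the last does not overlap the others.
example_clones = [(0, 40_000), (30_000, 70_000), (65_000, 100_000), (120_000, 160_000)]
example_contigs = merge_into_contigs(example_clones)
```

In this toy example the first three clones form one contig and the fourth forms a second, leaving a gap between them. Under the random-placement assumption, the expected uncovered fraction shrinks exponentially as redundancy rises, which is one way to see why libraries are built with such a high average number of copies per sequence.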