Identification of the sources of error in allele frequency estimations from pooled DNA indicates an optimal experimental design

Published online by Cambridge University Press: 13 December 2002

B. J. BARRATT
Affiliation:
JDRF/WT Diabetes and Inflammation Laboratory, Cambridge Institute for Medical Research, University of Cambridge, Wellcome Trust/MRC Building, Hills Road, Cambridge, CB2 2XY, UK
F. PAYNE
Affiliation:
JDRF/WT Diabetes and Inflammation Laboratory, Cambridge Institute for Medical Research, University of Cambridge, Wellcome Trust/MRC Building, Hills Road, Cambridge, CB2 2XY, UK
H. E. RANCE
Affiliation:
JDRF/WT Diabetes and Inflammation Laboratory, Cambridge Institute for Medical Research, University of Cambridge, Wellcome Trust/MRC Building, Hills Road, Cambridge, CB2 2XY, UK
S. NUTLAND
Affiliation:
JDRF/WT Diabetes and Inflammation Laboratory, Cambridge Institute for Medical Research, University of Cambridge, Wellcome Trust/MRC Building, Hills Road, Cambridge, CB2 2XY, UK
J. A. TODD
Affiliation:
JDRF/WT Diabetes and Inflammation Laboratory, Cambridge Institute for Medical Research, University of Cambridge, Wellcome Trust/MRC Building, Hills Road, Cambridge, CB2 2XY, UK
D. G. CLAYTON
Affiliation:
JDRF/WT Diabetes and Inflammation Laboratory, Cambridge Institute for Medical Research, University of Cambridge, Wellcome Trust/MRC Building, Hills Road, Cambridge, CB2 2XY, UK

Abstract

Genotyping costs still preclude analysis of a comprehensive SNP map in thousands of individual subjects in the search for disease susceptibility loci. Allele frequency estimation in DNA pools from cases and controls offers a partial solution, but variance in these estimates will result in some loss of statistical power. However, there has been no systematic attempt to quantify the several sources of error in previous studies. We report an analysis of the magnitude of the variance components of each experimental stage in DNA pooling studies, and find that a design based on the formation of numerous small pools of approximately 50 individuals is superior both to the formation of fewer, larger pools and to the replication of any of the experimental stages. We conclude that this approach may retain an effective sample size greater than 68% of the true sample size, whilst offering a 60-fold reduction in DNA usage and a greater than 30-fold saving in cost, compared to individual genotyping. The possibility of combining pooling with informed selection of haplotype tag SNPs is also considered. In this way further gains in efficiency may be possible by using pooled allele frequency estimates to infer haplotype frequencies and hence allele frequencies at untyped markers.
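
The link between pool size and effective sample size can be illustrated with a minimal sketch. It is not the authors' analysis: it assumes a simple two-component model in which each pool's allele frequency estimate carries binomial sampling variance p(1-p)/(2m) for a pool of m individuals, plus a single pool-level experimental error variance that does not depend on m. The function name `relative_efficiency` and the error variance used in the demonstration are hypothetical and are not taken from the paper. Under these assumptions, the effective sample size as a fraction of the true sample size is the sampling variance divided by the total variance per pool, which is independent of the number of pools.

```python
def relative_efficiency(p: float, pool_size: int, error_var: float) -> float:
    """Effective sample size as a fraction of the true sample size.

    Illustrative model only (an assumption, not the paper's method):
    each pool estimate has binomial sampling variance p(1-p)/(2*pool_size)
    plus a fixed experimental error variance `error_var` per pool.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("allele frequency p must lie strictly between 0 and 1")
    if pool_size <= 0 or error_var < 0.0:
        raise ValueError("pool_size must be positive and error_var non-negative")
    # Sampling variance from drawing 2 * pool_size chromosomes.
    sampling_var = p * (1.0 - p) / (2 * pool_size)
    # Averaging over k pools divides both variance terms by k, so k cancels
    # in the ratio against individual genotyping of the same subjects.
    return sampling_var / (sampling_var + error_var)


if __name__ == "__main__":
    hypothetical_error_var = 0.0005  # assumed value, for illustration only
    for m in (50, 200, 1000):
        print(m, round(relative_efficiency(0.3, m, hypothetical_error_var), 3))
```

In this toy model, a fixed per-pool error costs proportionally more as pools grow, because the sampling variance it is compared against shrinks. That is one way to see why many small pools could outperform fewer large ones.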

Type
Research Article
Copyright
© University College London 2002