The design of association studies is critically dependent upon the extent of linkage disequilibrium (LD) across different genomic regions, often summarised in terms of the mean absolute value of summary linkage disequilibrium measures. The two most commonly used measures are D′, for estimating the magnitude or extent of LD, and Δ, which is directly proportional to the power of LD mapping.
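For reference, the two measures can be written in their usual textbook form for a pair of biallelic loci. The symbols below (allele frequencies p_A and p_B, haplotype frequency p_AB) are notation assumed here for illustration and do not appear in the text above.

```latex
\begin{align*}
D &= p_{AB} - p_A\,p_B, \\[4pt]
D' &= \frac{D}{D_{\max}}, \qquad
D_{\max} =
\begin{cases}
\min\{\,p_A(1-p_B),\ (1-p_A)\,p_B\,\} & \text{if } D > 0, \\
\min\{\,p_A\,p_B,\ (1-p_A)(1-p_B)\,\} & \text{if } D < 0,
\end{cases} \\[4pt]
\Delta &= \frac{D}{\sqrt{p_A(1-p_A)\,p_B(1-p_B)}} .
\end{align*}
```

In this form D′ rescales D by the largest value it could take given the allele frequencies, whereas Δ is the correlation coefficient between alleles at the two loci.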
We studied the sampling distribution of the mean of the |D̂′| and |Δ̂| statistics for varying sample sizes and major allele frequencies. When the sample size is small or one allele frequency is extreme, estimates of the magnitude of association based on the mean of |D̂′| can be substantially inflated. This inflation is more marked when the haplotype frequencies have been inferred from genotype counts. The net effect is that smaller studies will tend to show higher apparent levels of LD. The magnitude of this inflation can be reduced by use of a bootstrap correction, and by avoiding markers with extreme allele frequencies. In contrast, the |Δ̂| statistic is much less affected by sample size and high major allele frequencies. These effects are illustrated with real data on 36 SNPs typed in an Ashkenazi Jewish population.
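The contrast between the two statistics can be sketched numerically. The Python below is a minimal illustration, not the procedure used in the study: the function names `ld_stats` and `bootstrap_corrected` are hypothetical, haplotype counts are assumed to be observed directly (not inferred from genotypes), and the correction shown is the generic bootstrap bias correction (twice the estimate minus the mean of the bootstrap replicates), which may differ from the correction the authors applied.

```python
import math
import random


def ld_stats(n_ab, n_aB, n_Ab, n_AB):
    """Return (|D'|, |Delta|) from counts of the four two-locus haplotypes."""
    n = n_ab + n_aB + n_Ab + n_AB
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n  # frequency of allele A at the first locus
    p_B = (n_AB + n_aB) / n  # frequency of allele B at the second locus
    d = p_AB - p_A * p_B
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        # At least one locus is monomorphic, so neither measure is defined.
        return (float("nan"), float("nan"))
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    return (abs(d) / d_max, abs(d) / math.sqrt(denom))


def bootstrap_corrected(counts, n_boot=1000, seed=0):
    """Generic bootstrap bias correction: 2 * estimate - mean(replicates).

    Assumption: haplotypes are resampled with replacement from the observed
    counts; results are clamped to the interval [0, 1].
    """
    rng = random.Random(seed)
    n = sum(counts)
    estimate = ld_stats(*counts)
    sums = [0.0, 0.0]
    used = 0
    for _ in range(n_boot):
        draws = rng.choices(range(4), weights=counts, k=n)
        resampled = [draws.count(i) for i in range(4)]
        replicate = ld_stats(*resampled)
        if math.isnan(replicate[0]):
            continue  # skip replicates in which a locus became monomorphic
        sums[0] += replicate[0]
        sums[1] += replicate[1]
        used += 1
    if used == 0:
        return estimate
    return tuple(
        min(1.0, max(0.0, 2 * e - s / used)) for e, s in zip(estimate, sums)
    )


if __name__ == "__main__":
    # One haplotype is absent and the major allele frequency is 0.9:
    # |D'| sits at its maximum while |Delta| stays modest.
    print(ld_stats(10, 0, 40, 50))
    # A small sample (20 haplotypes), raw versus bias-corrected estimates.
    small = (6, 4, 4, 6)
    print(ld_stats(*small), bootstrap_corrected(small))
```

The first example shows why extreme allele frequencies matter: whenever one of the four haplotypes is unobserved, |D̂′| equals 1 regardless of how weak the correlation is, and missing haplotypes become more likely as the sample shrinks or the minor allele becomes rare.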