Skip to main content Accessibility help
×
Home

Information:

  • Access
  • Cited by 9

Actions:

      • Send article to Kindle

        To send this article to your Kindle, first ensure no-reply@cambridge.org is added to your Approved Personal Document E-mail List under your Personal Document Settings on the Manage Your Content and Devices page of your Amazon account. Then enter the ‘name’ part of your Kindle email address below. Find out more about sending to your Kindle. Find out more about sending to your Kindle.

        Note you can select to send to either the @free.kindle.com or @kindle.com variations. ‘@free.kindle.com’ emails are free but can only be sent to your device when it is connected to wi-fi. ‘@kindle.com’ emails can be delivered even when you are not connected to wi-fi, but note that service fees apply.

        Find out more about the Kindle Personal Document Service.

        An algorithmic model for constructing a linkage and linkage disequilibrium map in outcrossing plant populations
        Available formats
        ×

        Send article to Dropbox

        To send this article to your Dropbox account, please select one or more formats and confirm that you agree to abide by our usage policies. If this is the first time you use this feature, you will be asked to authorise Cambridge Core to connect with your <service> account. Find out more about sending content to Dropbox.

        An algorithmic model for constructing a linkage and linkage disequilibrium map in outcrossing plant populations
        Available formats
        ×

        Send article to Google Drive

        To send this article to your Google Drive account, please select one or more formats and confirm that you agree to abide by our usage policies. If this is the first time you use this feature, you will be asked to authorise Cambridge Core to connect with your <service> account. Find out more about sending content to Google Drive.

        An algorithmic model for constructing a linkage and linkage disequilibrium map in outcrossing plant populations
        Available formats
        ×
Export citation

Summary

A linkage–linkage disequilibrium map that describes the pattern and extent of linkage dis-equilibrium (LD) decay with genomic distance has now emerged as a viable tool to unravel the genetic structure of population differentiation and fine-map genes for complex traits. The prerequisite for constructing such a map is the simultaneous estimation of the linkage and LD between different loci. Here, we develop a computational algorithm for simultaneously estimating the recombination fraction and LD in a natural outcrossing population with multilocus marker data, which are often estimated separately in most molecular genetic studies. The algorithm is founded on a commonly used progeny test with open-pollinated offspring sampled from a natural population. The information about LD is reflected in the co-segregation of alleles at different loci among parents in the population. Open mating of parents will reveal the genetic linkage of alleles during meiosis. The algorithm was constructed within the polynomial-based mixture framework and implemented with the Expectation–Maximization (EM) algorithm. The by-product of the derivation of this algorithm is the estimation of outcrossing rate, a parameter useful to explore the genetic diversity of the population. We performed computer simulation to investigate the influences of different sampling strategies and different values of parameters on parameter estimation. By providing a number of testable hypotheses about population genetic parameters, this algorithmic model will open a broad gateway to understand the genetic structure and dynamics of an outcrossing population under natural selection.

1. Introduction

Linkage disequilibrium (LD) refers to the non-random association of alleles at different loci in the genome. Historically, LD analysis was developed to study the genetic structure and diversity of natural populations (Lewontin, 1964; Hill, 1974; Hedrick, 1987; Weir, 1996). In recent years, there has been a dramatic increase of interest in utilizing LD to infer the evolutionary history and process of human populations (Reich et al., 2001; Ardlie et al., 2002; Dawson et al., 2002; Gabriel et al., 2002) and to identify genes for disease or yield traits by association analyses with DNA-based markers (Remington et al., 2001; Ardlie et al., 2002). The efficacy of LD analysis in population genetic studies and gene mapping depends on the level of LD in the population studied, its distribution and heterogeneity across the genome, and its relationship with genetic or geographic distances (Farnir et al., 2000; McRae et al., 2002; Liu et al., 2006). The pattern of LD decay is well known in human populations, where the relationship between the LD and physical distance is graphically elucidated to infer the origin and evolution of humans (Tishkoff et al., 1996, 2001; Tishkoff & Williams, 2002).

The pattern of LD decay with genetic distance (measured in terms of the recombination fraction) can be used to characterize the genetic structure and dynamics of populations. This needs the simultaneous measures of the LD and recombination fraction between the same pair of loci. However, the estimation of these two parameters is usually based on different genetic designs; i.e. the estimation of the recombination fraction relies on a segregating pedigree, whereas the estimation of LD needs a random sample drawn from a natural population. More recently, several designs have been proposed to jointly measure the linkage and linkage disequilibrium for natural populations (Wu & Zeng, 2001) and domestic animals (Georges, 2007). Simultaneous estimation of LD and the recombination fraction can avoid false positive results (spurious LD) when LD is used to fine-map genes for complex traits given the frequent occurrence of LD between distantly spaced loci or unlinked loci.

Wu & Zeng (2001) proposed an open-pollinated (OP) design for population genetic studies of forest trees with molecular markers. For most forest tree species, seeds from a single mother tree are derived from the open pollination of unknown fathers from the pollen pool. By collecting OP seeds from a sample of individual trees in a natural population, Wu & Zeng's design did not take into account the hermaphroditical nature of a tree species in which both sexes exist on the same individual, and thus its seeds may be derived from both selfing and outcrossing pollination. Self-fertilization is thought to affect diversity by reduced effective population size and reduced genome-wide effective recombination rates, both due to increased homozygosity, elevated isolation among individuals and subpopulations induced by inbreeding (Charlesworth, 2003). Consequently, a predominantly selfing mode of reproduction may be expected to lead to low polymorphism, extensive LD and high population subdivision (Nordborg, 2000; Ingvarsson, 2002, 2005). These predictions can be tested by simultaneous estimation of the outcrossing rate, recombination fraction and LD. Also, the OP design proposed by Wu & Zeng (2001) did not incorporate a procedure for estimating the diplotype of heterozygous trees from which seeds are sampled. In this paper, we extend Wu & Zeng's OP progeny design to better understand the genetic structure of a natural population by simultaneously estimating multiple population genetic parameters with molecular markers. Simulation studies were performed to examine the statistical behaviour of the model.

Model

(i) Sampling and genotyping strategy

Suppose there is a natural population at Hardy–Weinberg equilibrium (HWE) for a dioecious plant species. Each plant in the population is OP by its own pollen (selfing) and randomly by the pollen from other individuals (outcrossing). Thus, seeds produced by each plant include a mix of offspring due to selfing and outcrossing pollination. We will randomly sample a set of maternal plants and further randomly collect a sample of seeds from each sampled plant. Because the fathers of seeds from a sampled maternal plant are unknown, this sampling strategy will generate a set of half-sib families. The collected seeds (embryos) are germinated into seedlings. DNA samples are taken from maternal plants and their offspring derived from the seeds for marker analysis.

A panel of molecular markers is typed to examine population genetic properties by estimating the recombination fraction, LD and outcrossing rate. Consider two markers, each with two alleles, 1 and 0, which are generally denoted by i for the first marker (i=1, 0) and j for the second marker (j=1, 0). Different alleles at each marker unite to form four gametes, whose frequencies in the population are expressed as

(1)

which sum to one, where p and 1−p are the frequencies of two alleles, 1 and 0, for the first marker, q and 1−q are the frequencies of two alleles for the second marker, and D is the degree of gametic LD between the two markers. It is assumed that there is no sex-specific difference in gamete frequencies, allele frequencies and LD in the population.

Among the sampled maternal plants, there are nine genotypes for the two markers considered, generally expressed as iijj′ (ii′=1, 0; jj′=1, 0). Let N iijj denote the number of maternal plants with genotype iijj′. Under the assumption of HWE, the frequency of a diplotype is the product of the frequencies of the gametes that form the diplotype. By collapsing those diplotypes that are observed as the same genotype, the frequencies of genotypes are generally expressed as

(2)

Table 1 gives the genotype frequencies of the maternal plants for the two markers in terms of haplotype or diplotype frequencies calculated with equation (2).

Table 1. Diplotype and genotype frequencies of two markers, A and B, in the offspring population through outcrossing and selfing pollination

The seeds collected from each sampled maternal plant are typed for the two markers so that the genotype of each offspring can be known. The same offspring genotype from the same maternal genotype are mixed up. Let be the mixed number of offspring with genotype llrr′ (ll′=1, 0; rr′=1, 0) collected from N iijj maternal plants with genotype iijj′. From observed offspring genotypes, we will estimate key population genetic parameters that define population structure and organization.

(ii) Offspring structure

Each sampled maternal plant undergoes meiosis to produce male and female gametes. For those double homozygotes at the two markers considered, only one gamete type is yielded. The plants which are heterozygous only for one marker produce two types of gametes with equal frequency. For the double heterozygote plants, there are four possible types of gametes: 11, 10, 01 and 00. This type of plant has two possible diplotypes 11|00 and 01|10, which will produce different gamete frequencies expressed as a function of the recombination fraction (r) (Table 2). Of all double heterozygotes, there is a relative proportion of

for diplotype 11|00, and

for diplotype 10|01.

Table 2. Two possible diplotypes of a maternal plant of double heterozygote and the frequencies of its four gametes for two markers

Each female gamete produced by a maternal plant unites at random with its own male gamete to form a selfing offspring, or with a gamete from the pollen pool to form an outcrossing offspring. Table 1 lists the frequencies of two-marker diplotypes (and therefore genotypes) in the selfing and outcrossing offspring populations produced by possible maternal genotypes. The pollen pool that contributes to the outcrossing seeds contains four male gametes, 11, 10, 01 and 00, whose frequencies are defined by p 11, p 10, p 01 and p 00, respectively. Let w be the outcrossing rate of the plant measured by the proportion of its offspring that are generated through fertilization by pollens of other plants in the population. Thus, the selfing rate of the plant that receives its own pollen to pollinate is .

Although marker genotypes of offspring sampled from a given maternal genotype can be observed, the mechanisms of genotype formation are unknown. The formation of progeny genotypes includes four mechanisms:

  1. (1) Pollination behaviour: The same progeny genotype can be derived from the selfing or outcrossing of a maternal plant. Let denote the overall frequency of offspring genotype llrr′ derived from maternal genotype iijj′, which is generally expressed, by considering all possible mechanisms of genotype formation, as

    (3)
    where and are the frequencies of offspring genotype llrr′ derived from maternal genotype iijj′ due to the maternal plant's selfing and outcrossing pollination, respectively (Table 1).

  2. (2) Sex origin of a gamete: The same progeny genotype may be due to reciprocal combinations between two gametes from male and female sides. For example, a maternal genotype 11/10 yields two female gametes, 11 and 10. When they unite with male gametes 10 and 11, respectively, the same progeny genotype results.

  3. (3) The complementarity of gametes: If two female gametes of a maternal genotype are complementary to those of the pollen pool, such combinations will produce the same outcrossing progeny genotype. For example, two gametes of maternal genotype 11/10, 11 and 10 are respectively combined with complementary male gametes from the pollen pool, 00 and 01, to generate the same progeny genotype 10/10.

  4. (4) Double heterozygote of a maternal plant: This type of plant contains two different diplotypes which produce the same arrays of gametes but with different relative proportions (see Table 2).

(iii) Likelihood and estimation

Based on the structure of offspring genotypes in Table 1, we construct a log likelihood for parameters Θ=(p 11, p 10, p 01, p 00, w, r) as

(4)

Likelihood (4) is implemented with the Expectation–Maximization (EM) algorithm to obtain the maximum likelihood estimates (MLEs) of Θ.

(a) Estimation of gamete frequencies

In the E step, calculate the expected numbers of gametes 11, 10, 01 and 00 within each observed offspring genotype derived from a maternal genotype, expressed as

(5)

Tables 3–6 provide the formulae for estimating the expected numbers of gametes 11, 10, 01 and 00, respectively. In the M step, estimate the gamete frequencies using

(6)

Table 3. The expected number () of gamete 11, within an offspring genotype derived from a maternal genotype. Note that the double heterozygote is obtained by dividing the expression in the table by both the frequency of maternal genotype () and the overall frequency of the corresponding offspring genotype (), whereas for all the genotypes is calculated by dividing the expression only by the overall frequency of the corresponding offspring genotypes

Table 4. The expected number () of gamete 10, within an offspring genotype derived from a maternal genotype. Note that for the double heterozygote is obtained by dividing the expression in the table by both the frequency of maternal genotype () and the overall frequency of the corresponding offspring genotype (), whereas for all the genotypes is calculated by dividing the expression only by the overall frequency of the corresponding offspring genotypes

Table 5. The expected number () of gamete 01, within an offspring genotype derived from a maternal genotype. Note that for the double heterozygote is obtained by dividing the expression in the table by both the frequency of the maternal genotype (Pii′jj′) and the overall frequency of the corresponding offspring genotype (), whereas for all the genotypes is calculated by dividing the expression only by the overall frequency of the corresponding offspring genotypes

Table 6. The expected number () of gamete 00, within an offspring genotype derived from a maternal genotype. Note that for the double heterozygote is obtained by dividing the expression in the table by both the frequency of maternal genotype (Pii′jj′) and the overall frequency of the corresponding offspring genotype (), whereas for all the genotypes is calculated by dividing the expression only by the overall frequency of the corresponding offspring genotypes

Estimation of the recombination fraction

In the E step, calculate the expected number of r within an offspring genotype derived from a maternal plant of double heterozygote using

(7)

In the M step, estimate the recombination fraction using

(8)

Estimation of outcrossing rate

In the E step, calculate the expected number of w within each possible offspring genotype derived from a maternal genotype using

(9)

In the M step, calculate the outcrossing rate using

(10)

Note that not all offspring genotypes contain w if they are not derived from a double heterozygote maternal plant (see Table 1), although the summation over all possible genotypes is generally given.

An iterative loop of E and M steps between equations (5), (7) and (9) and equations (6), (8) and (10) is constructed to estimate the parameters. After haplotype frequencies are estimated, allele frequencies at two markers and their LD (D) are estimated as

(iv) Hypothesis testing

The genetic parameters estimated, i.e. the LD (D), outcrossing rate (w) and recombination fraction (r), can be used to describe the genetic structure of a population. These parameters should be tested for their significance. The following hypotheses are formulated:

For each hypothesis, the likelihoods, and , are calculated, respectively, where the tilde corresponds to the MLEs for the null hypothesis and the hat corresponds to the MLEs for the alternative hypothesis. The log-likelihood ratio test statistic is then calculated by using

(11)

which is asymptotically χ2-distributed with one degree of freedom. The estimates of the parameters under each null hypothesis should be derived separately.

3. Computer simulation

(i) Design

Computer simulation was conducted to examine the statistical properties of the two-locus model for estimating the LD, outcrossing rate and recombination fraction between different molecular markers in a natural population. We consider a set of OP families randomly derived from a natural population, in which the genotype distribution of two given markers were simulated with their frequencies. The frequencies of two-locus genotypes (derived from diplotypes) are determined by gamete frequencies, specified by allele frequencies and LD in a population and the recombination fraction for a given maternal plant. We will consider the influences of different outcrossing rates (i.e. w=0·1 for low, 0·5 for medium and 0·9 for high), different recombination fractions (i.e. r=0·05 for a strong linkage and 0·25 for a weak linkage) and different linkage disequilibria (i.e. D=0·02 for strong independence and 0·10 for weak independence) on parameter estimation. We will consider all these possible combinations of parameter values.

To provide practical guidance on the use of this model, we simulate marker data with three different sampling strategies. A fixed number of samples (say 1000) can be allocated among and within OP families. We will use three sampling strategies: (1) small family number×large family size’ (10×100), (2) moderate family number×moderate family size (32×32), and large family number×small family size (100×10). Results under each of these strategies will be given.

(ii) Results

Tables 7–9 summarize the simulation results with different parameters and sampling strategies. In general, the model provides reasonable estimates of all parameters, although the accuracy and precision of parameter estimates depend on the values of these parameters, sampling strategies and interactions among all the factors. The estimation of the recombination fraction tends to prefer the ‘small family number×large family size’ sampling strategy (Table 7). It appears that the `large family number×small family size’ sampling strategy is favourable for the estimation of population genetic parameters including allele frequencies, LD and outcrossing rate (Table 9). The ‘moderate family number×moderate family size’ sampling strategy is somewhat in between (Table 8). In all the strategies, the estimation precision of allele frequencies and LD increases with increasing outcrossing rate. The recombination fraction can be estimated more precisely when the two markers are strongly linked. It is interesting to see that increasing LD leads to better estimation of the recombination fraction. As expected, increasing the outcrossing rate reduces the estimation precision of the recombination fraction. Especially, when outcrossing rate is very high (w=0·9), the recombination fraction will be poorly estimated for the two markers that are strongly independent.

Table 7. MLEs of parameters and their standard errors (in parentheses) obtained from 100 simulation replicates with the (small family number×large family size) sampling strategy

Table 8. MLEs of parameters and their standard errors (in parentheses) obtained from 100 simulation replicates with the (moderate family number×moderate family size) sampling strategy

Table 9. MLEs of parameters and their standard errors (in parentheses) obtained from 100 simulation replicates with the (large family number×small family size) sampling strategy

There is reasonably good power for detecting a significant LD and linkage between two markers, although such a power varies with the values of the parameters (results not shown). It seems that the power detection is not sensitive to sampling strategies. There are marked interactions in the power sensitivity between parameter values. The power of linkage detection decreases with increasing outcrossing rate, whereas the power of LD detection increases with increasing outcrossing rate.

4. Discussion

The past two decades have witnessed a dramatic increase of interest in molecular marker technologies and their applications to study the genetic structure of a natural population and map QTLs (quantitative trait loci) responsible for a quantitative trait (Reich et al., 2001; Ardlie et al., 2002; Dawson et al., 2002; Gabriel et al., 2002; reviewed in Georges 2007). In this paper, we have proposed an algorithmic approach for constructing the linkage–linkage disequilibrium map of a genome by genotyping a set of OP seeds sampled from a natural population. By estimating several key population genetic parameters, i.e. the relative occur-rence of selfing and outcrossing, LD, heterozygosity (estimated from allele frequencies) and recombination fraction, this approach will provide a tool for better understanding the pattern and organization of genetic variation in outcrossing populations. Furthermore, by elucidating the relationship between the linkage and LD in terms of the so-called LD map, the new algorithm can be used to infer the evolutionary history and process of natural populations and to identify genes for disease or yield traits (Remington et al., 2001; Ardlie et al., 2002; Farnir et al., 2000; McRae et al., 2002; Rafalski & Morgante, 2004).

The new approach capitalizes on the outcrossing nature of plants, allowing a certain proportion of selfing. Outcrossing is a common characteristic of many plants, including economically and ecologically important species like poplar, eucalyptus, pine and spruce (Butcher & Southerton, 2007; Miller & Schaal, 2006). This approach will find its immediate application in the genetic research of these important but understudied species. It has three significant advantages. First, it is simple and easily deployed in practice. By sampling and genotyping half-sib seeds from multiple maternal plants in a population, the approach provides the estimation of important population genetic parameters. Second, we derived a group of EM-based closed forms for parameter estimation for the OP-based sampling strategy, greatly facilitating the computing process of the parameters. The accuracy and precision of parameter estimation are affected by many factors including sample size and parameter range. A reasonable sampling strategy including the relative importance of family number and family size can be readily determined from simulation studies. Third, the approach allows the test of a number of meaningful hypotheses about the linkage, LD and outcrossing rate, providing a quantitative framework for understanding the genetic structure of a natural outcrossing population.

The approach can be extended to consider multiallelic markers and dominant markers. Unlike many annual crops, forest trees are still in wild or semi-wild conditions, in which there is a rich source of variation due to many alleles at a single gene. Multiallelic markers like microsatellites are a vital tool for the population genetic study of forest trees. On the other hand, for many underrepresented organisms, some economically cheap dominant markers are still useful although their informativeness is limited (Kuang et al., 1998; Silbiger et al., 1998; Kremer et al., 2005). When the OP-based sampling strategy considers multiallelic or dominant markers, new, more complicated algorithms need to be derived. Li et al. (2007) derived a model for the LD between dominant markers in a diploid population. Their model can be integrated with our OP-based sampling strategy to provide a comprehensive estimation of population genetic parameters with dominant markers. The computer code of the proposed algorithm is available from the corresponding author upon request.

This work was partially supported by Joint NSF/NIH grant number DMS/NIGMS-0540745 and NNSFC grant number 30771752.

References

Ardlie, K. G., Kruglyak, L. & Seielstad, M. (2002). Patterns of linkage disequilibrium in the human genome. Nature Reviews Genetics 3, 299309.
Butcher, P. A. & Southerton, S. (2007). Marker-assisted selection in forestry species. In Marker-assisted Selection: Current Status and Future Perspectives in Crops, Livestock, Forestry and Fish (ed. Guimaraes, E., Ruane, J., Scherf, B., Sonnino, A. and Dargie, J.), pp. 283–305, chapter 15. Rome: FAO.
Charlesworth, D. (2003). The effects of inbreeding on the genetic diversity of populations. Philosophical Transactions of the Royal Society of London, Series B, Biological Sciences 358, 10511070.
Dawson, E., Abecasis, G. R., Bumpstead, S., Chen, Y., Hunt, S., Beare, D. M., Pabial, J., Dibling, T., Tinsley, E., Kirby, S., Carter, D., Papaspyridonos, M., Livingstone, S., Ganske, R., Lohmmussaar, E., Zernant, J., Tonisson, N., Remm, M., Magi, R., Puurand, T., Vilo, J., Kurg, A., Rice, K., Deloukas, P., Mott, R., Metspalu, A., Bentley, D. R., Cardon, L. R. & Dunham, I. (2002). A first-generation linkage disequilibrium map of human chromosome 22. Nature 418, 544548.
Farnir, F., Coppieters, W., Arranz, J.-J., Berzi, P., Cambisano, N., Grisart, B., Karim, L., Marcq, F., Moreau, L., Mni, M., Nezer, C., Simon, P., Vanmanshoven, P., Wagenaar, D. & Georges, M. (2000). Extensive genome-wide linkage disequilibrium in cattle. Genome Research 10, 220227.
Gabriel, S. B., Schaffner, S. F., Nguyen, H., Moore, J. M., Roy, J., Blumenstiel, B., Higgins, J., DeFelice, M., Lochner, A., Faggart, M., Liu-Cordero, S. N., Rotimi, C., Adeyemo, A., Cooper, R., Ward, R., Lander, E. S., Daly, M. J. & Altshuler, D. (2002). The structure of haplotype blocks in the human genome. Science 296, 22252229.
Georges, M. (2007). Mapping, fine mapping, and molecular dissection of quantitative trait loci in domestic animals. Annual Review of Genomics and Human Genetics 8, 131162.
Hedrick, P. W. (1987). Gametic disequilibrium measures: proceed with caution. Genetics 117, 331341.
Hill, W. G. (1974). Estimation of linkage disequilibrium in randomly mating populations. Heredity 33, 229239.
Ingvarsson, P. K. (2002). A metapopulation perspective of genetic diversity and differentiation in partially self-fertilizing plants. Evolution 56, 23682373.
Ingvarsson, P. K. (2005). Nucleotide polymorphism and linkage disequilibrium within and among natural populations of European aspen (Populus tremula L., Salicaceae). Genetics 169, 945953.
Kremer, A., Caron, H., Cavers, S., Colpaert, N., Gheysen, G., Gribe, R., Lemes, M., Lowe, A. J., Margis, R., Navarro, C. & Salgueiro, F. (2005). Monitoring genetic diversity in tropical trees with multilocus dominant markers. Heredity 95, 274280.
Kuang, H., Richardson, T. E., Carson, S. D. & Bongarten, B. C. (1998). An allele responsible for seedling death in Pinus radiata D. Don. Theoretical and Applied Genetics 96, 640644.
Lewontin, R. C. (1964). The interaction of selection and linkage. I. General considerations; heterotic models. Genetics 49, 4967.
Li, Y. C., Li, Y., Wu, S., Han, K., Wang, Z. J., Hou, W., Zeng, Y. R. & Wu, R. L. (2007). Estimation of linkage disequilibria in diploid populations with multilocus dominant markers. Genetics 176, 18791892.
Liu, T., Todhunter, R. J., Lu, Q., Schoettinger, L., Li, H. Y., Littell, R. C., Bliss, S., Acland, G., Lust, G. & Wu, R. L. (2006). Extent and distribution of zygotic linkage disequilibrium in canine. Genetics 174, 439453.
McRae, A. F., MceWan, J. C., Dodds, K. G., Wilson, T., Crawford, A. M. & Slate, J. (2002). Linkage disequilibrium in domestic sheep. Genetics 160, 11131122.
Miller, A. J. & Schaal, B. A. (2006). Domestication and the distribution of genetic variation in wild and cultivated populations of the Mesoamerican fruit tree Spondias purpurea L. (Anacardiaceae). Molecular Ecology 15, 14671480.
Nordborg, M. (2000). Linkage disequilibrium, gene trees and selfing: ancestral recombination graph with partial self-fertilization. Genetics 154, 923929.
Rafalski, A. & Morgante, M. (2004). Corn and humans: recombination and linkage disequilibrium in two genomes of similar size. Trends in Genetics 20, 103111.
Reich, D. E., Cargill, M., Bolk, S., Ireland, J., Sabeti, P. C., Richter, D. J., Lavery, T., Kouyoumjian, R., Farhadian, S. F., Ward, R. & Lander, E. S. (2001). Linkage disequilibrium in the human genome. Nature 411, 199204.
Remington, D. L., Thornsberry, J. M., Matsuoka, Y., Wilson, L. M., Whitt, S. R., Doebley, J., Kresovich, S., Goodman, M. M. & Buckler, E. S. IV (2001). Structure of linkage disequilibrium and phenotypic associations in the maize genome. Proceedings of the National Academy of Sciences of the USA 98, 1147911484.
Silbiger, R. N., Christ, S. A., Leonard, A. C., Garg, M., Lattier, D. L., Dawes, S., Dimsoski, P., McCormick, F., Wessendarp, T., Gordon, D. A., Roth, A. C., Smith, M. K. & Toth, G. P. (1998). Preliminary studies on the population genetics of the central stoneroller (Campostoma anomalum) from the Great Miami River Basin, Ohio. Environmental Monitoring and Assessment 51, 481495.
Tishkoff, S. A. & Williams, S. M. (2002). Genetic analysis of African populations: human evolution and complex disease. Nature Reviews Genetics 3, 611621.
Tishkoff, S. A., Dietzsch, E., Speed, W., Pakstis, A. J., Kidd, J. R., Cheung, K., Bonne-Tamir, B., Santachiara-Benerecetti, A. S., Moral, P., Krings, M., Pääbo, S., Watson, E., Risch, N., Jenkins, T. & Kidd, K. K. (1996). Global patterns of linkage disequilibrium at the CD4 locus and modern human origins. Science 271, 13801387.
Tishkoff, S. A., Varkonyi, R., Cahinhinan, N., Abbes, S., Argyropoulos, G., Destro-Bisol, G., Drousiotou, A., Dangerfield, B., Lefranc, G., Loiselet, J., Piro, A., Stoneking, M., Tagarelli, A., Tagarelli, G., Touma, E. H., Williams, S. M. & Clark, A. G. (2001). Haplotype diversity and linkage disequilibrium at human G6PD: recent origin of alleles that confer malarial resistance. Science 293, 455462.
Weir, B. S. (1996). Genetic Data Analysis. Sunderland, MA: Sinauer.
Wu, R. L. & Zeng, Z.-B. (2001). Joint linkage and linkage disequilibrium mapping in natural populations. Genetics 157, 899909.