
The effect of missing marker genotypes on the accuracy of gene-assisted breeding value estimation: a comparison of methods*

  • H. A. Mulder (a1), T. H. E. Meuwissen (a2), M. P. L. Calus (a1) and R. F. Veerkamp (a1)


In livestock populations, missing genotypes for a large proportion of the animals are a major problem when implementing gene-assisted breeding value estimation for genes with known effect. The objective of this study was to compare, using Monte Carlo simulation, how different methods of dealing with missing genotypes affect the accuracy of gene-assisted breeding value estimation for identified bi-allelic genes. A nested full-sib half-sib structure was simulated with a mixed inheritance model comprising one bi-allelic quantitative trait locus (QTL) and a polygenic effect due to an infinite number of polygenes. The effect of the QTL was included in gene-assisted BLUP either by random regression on predicted gene content, i.e. the number of positive alleles, or by including haplotype effects in the model with an inverse IBD matrix to account for identity-by-descent relationships between haplotypes using linkage analysis information (IBD–LA). The inverse IBD matrix was constructed using segregation indicator probabilities obtained from multiple-marker iterative peeling. Gene contents for unknown genotypes were predicted using either multiple-marker iterative peeling or mixed model methodology. For both methods, gene-assisted breeding value estimation increased the accuracy of total estimated breeding value (EBV) by 0% to 22% for genotyped animals in comparison with conventional breeding value estimation. For animals that were not genotyped, the increase in accuracy was much lower (0% to 5%), but still substantial when the heritability was 0.1 and when the QTL explained at least 15% of the genetic variance. Regression on predicted gene content yielded higher accuracies than IBD–LA. Allele substitution effects were, however, overestimated, especially when only sires and males in the last generation were genotyped. For juveniles without phenotypic records and for traits measured only on females, the superiority of regression on gene content over IBD–LA was larger than when all animals had phenotypes.
Missing gene contents were predicted with higher accuracy using multiple-marker iterative peeling than using mixed model methodology, but the difference in accuracy of total EBV was negligible, and mixed model methodology was computationally much faster than multiple-marker iterative peeling. For large livestock populations, it can be concluded that gene-assisted breeding value estimation is in practice best performed by regression on gene contents, using mixed model methodology to predict missing marker genotypes and combining phenotypic information of genotyped and ungenotyped animals in one evaluation. This technique would, in principle, also be feasible for genomic selection. It is expected that genomic selection for ungenotyped animals using predicted single nucleotide polymorphism gene contents might be beneficial, especially for traits with low heritability.
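To make the idea of predicting missing gene contents from relatives concrete, the following is a minimal sketch, not the evaluation model used in the study. It assumes that gene content (0, 1 or 2 copies of the positive allele) behaves as a trait with heritability 1 and a known mean of 2p, so that a missing gene content can be predicted from genotyped relatives through the additive relationship matrix. The function names, the five-animal pedigree, the allele frequency p = 0.5 and the simulated phenotypes are all hypothetical, and the final ordinary least-squares fit is only a stand-in for the random regression on gene content within BLUP described above (it ignores the polygenic effect).

```python
import numpy as np

def relationship_matrix(sire, dam):
    """Additive relationship matrix A by the tabular method.
    Parents must precede offspring; -1 marks an unknown parent."""
    n = len(sire)
    A = np.zeros((n, n))
    for i in range(n):
        s, d = sire[i], dam[i]
        for j in range(i):
            a = 0.0
            if s >= 0:
                a += 0.5 * A[j, s]
            if d >= 0:
                a += 0.5 * A[j, d]
            A[i, j] = A[j, i] = a
        # Diagonal is 1 + inbreeding coefficient (half the parents' relationship).
        A[i, i] = 1.0 + (0.5 * A[s, d] if s >= 0 and d >= 0 else 0.0)
    return A

def predict_gene_content(A, gene_content, mean):
    """Predict missing gene contents (NaN) from observed ones.
    Assumption: heritability of gene content is 1 and the mean is known."""
    gc = np.asarray(gene_content, dtype=float)
    known = ~np.isnan(gc)
    pred = gc.copy()
    dev = np.linalg.solve(A[np.ix_(known, known)], gc[known] - mean)
    pred[~known] = mean + A[np.ix_(~known, known)] @ dev
    return pred

def allele_substitution_effect(y, gene_content):
    """Slope of a simple least-squares regression of phenotype on gene content."""
    slope, _intercept = np.polyfit(gene_content, y, 1)
    return slope

# Hypothetical pedigree: animals 0 and 1 are founders, 2 and 3 are their
# full-sib offspring, 4 is an offspring of animal 2 with an unknown dam.
sire = [-1, -1, 0, 0, 2]
dam = [-1, -1, 1, 1, -1]
A = relationship_matrix(sire, dam)

# Only the sire (animal 0) is genotyped; he carries two positive alleles.
observed = [2, np.nan, np.nan, np.nan, np.nan]
p = 0.5  # assumed frequency of the positive allele
predicted = predict_gene_content(A, observed, mean=2 * p)

# Hypothetical noise-free phenotypes with an allele substitution effect of 2.
y = 10 + 2 * predicted
alpha_hat = allele_substitution_effect(y, predicted)
```

In this toy pedigree the unrelated dam is predicted at the population mean, while the sire's offspring and grand-offspring are pulled towards his genotype in proportion to their relationship with him. Predicted gene contents are therefore regressed towards the mean, which is one reason a full evaluation should model them together with the polygenic effect rather than by a plain regression as here.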



* This paper was presented at the session ‘Genomics selection and bioinformatics’ of the 59th Annual meeting of the European Association for Animal Production held in Vilnius (Lithuania), 24–27 August 2008. Dr A. Maki-Tanila acted as guest editor.




