
Imputation of ungenotyped parental genotypes in dairy and beef cattle from progeny genotypes

  • D. P. Berry (a1), S. McParland (a1), J. F. Kearney (a2), M. Sargolzaei (a3) and M. P. Mullen (a4)...


The objective of this study was to quantify the accuracy of imputing the genotype of parents using information on the genotype of their progeny and a family-based and population-based imputation algorithm. Two separate data sets were used, one containing both dairy and beef animals (n=3122) with high-density genotypes (735 151 single nucleotide polymorphisms (SNPs)) and the other containing just dairy animals (n=5489) with medium-density genotypes (51 602 SNPs). The imputation accuracy of three different genotype density panels was evaluated, representing low (i.e. 6501 SNPs), medium and high density. The full genotypes of sires with genotyped half-sib progeny were masked and subsequently imputed. Genotyped half-sib progeny group sizes were varied from 4 up to 12 and the impact on imputation accuracy was quantified. Up to 157 and 258 sires were used to test the accuracy of imputation in the dairy plus beef data set and the dairy-only data set, respectively. The efficiency and accuracy of imputation were quantified as the proportion of genotypes that could not be imputed, and as both the genotype concordance rate and the allele concordance rate. The median proportion of genotypes per animal that could not be imputed decreased as the number of genotyped half-sib progeny increased; values for the medium-density panel ranged from a median of 0.015 with a half-sib progeny group size of 4 to a median of 0.0014 to 0.0015 with a half-sib progeny group size of 8. The accuracy of imputation across different paternal half-sib progeny group sizes was similar in both data sets.
Concordance rates increased considerably as the number of genotyped half-sib progeny increased from four (mean animal allele concordance rate of 0.94 in both data sets for the medium-density genotype panel) to five (mean animal allele concordance rate of 0.96 in both data sets for the medium-density genotype panel), after which they were relatively stable up to a half-sib progeny group size of eight. In the data set with dairy-only animals, sufficient sires with paternal half-sib progeny groups of up to 12 were available, and the within-animal mean genotype concordance rates continued to increase up to this group size. The accuracy of imputation was worst for the low-density genotypes, especially with smaller half-sib progeny group sizes, but the difference in imputation accuracy between density panels diminished as progeny group size increased; the difference between high and medium-density genotype panels was relatively small across all half-sib progeny group sizes. Where biological material or genotypes are not available on individual animals, at least five progeny can be genotyped (on either a medium or high-density genotyping platform) and the parental alleles imputed with, on average, ⩾96% accuracy.
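The three measures named above (proportion of genotypes not imputed, genotype concordance rate and allele concordance rate) can be illustrated for a single animal with the minimal sketch below. It is not the authors' implementation. It assumes genotypes are coded as 0, 1 or 2 copies of one allele, that a hypothetical code of -1 marks a genotype that could not be imputed, and that both concordance rates are computed only over genotypes that were imputed; the function name `imputation_metrics` is likewise hypothetical.

```python
from typing import Dict, Sequence

MISSING = -1  # assumed code for a genotype that could not be imputed


def imputation_metrics(true: Sequence[int], imputed: Sequence[int]) -> Dict[str, float]:
    """Per-animal imputation summary for genotypes coded 0/1/2 (assumed coding)."""
    if len(true) != len(imputed) or len(true) == 0:
        raise ValueError("true and imputed must be non-empty and of equal length")

    # Keep only the loci at which a genotype was actually imputed.
    called = [(t, i) for t, i in zip(true, imputed) if i != MISSING]
    n_total = len(true)
    n_called = len(called)

    prop_not_imputed = (n_total - n_called) / n_total
    if n_called == 0:
        return {
            "prop_not_imputed": prop_not_imputed,
            "genotype_concordance": float("nan"),
            "allele_concordance": float("nan"),
        }

    # Genotype concordance: share of imputed genotypes identical to the true genotype.
    genotype_concordance = sum(t == i for t, i in called) / n_called

    # Allele concordance: share of alleles correct; each locus carries two alleles,
    # and |t - i| of them are wrong under the 0/1/2 coding.
    allele_concordance = sum(2 - abs(t - i) for t, i in called) / (2 * n_called)

    return {
        "prop_not_imputed": prop_not_imputed,
        "genotype_concordance": genotype_concordance,
        "allele_concordance": allele_concordance,
    }


# Toy data for illustration only (not taken from the study).
example = imputation_metrics(
    true=[0, 1, 2, 2, 1, 0, 1, 2],
    imputed=[0, 1, 2, 1, 1, 2, MISSING, 2],
)
```

Under this coding, a heterozygote imputed as a homozygote counts as a genotype error but only half an allele error, which is why the allele concordance rate is never lower than the genotype concordance rate.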






