Population structure and breed composition prediction in a multi-breed sheep population using genome-wide single nucleotide polymorphism genotypes

  • A. C. O’Brien (a1) (a2), D. C. Purfield (a1), M. M. Judge (a1), C. Long (a3), S. Fair (a2) and D. P. Berry (a1)...


Knowledge of the population structure and breed composition of a population can be advantageous for a number of reasons; these include designing optimal (cross)breeding strategies to maximise non-additive genetic effects, maintaining flockbook integrity by authenticating animals being registered, and providing a quality control measure in the genotyping process. The objectives of the present study were to 1) describe the population structure of 24 sheep breeds, 2) quantify the breed composition of both flockbook-recorded and crossbred animals using single nucleotide polymorphism BLUP (SNP-BLUP), and 3) quantify the accuracy of breed composition prediction from low-density genotype panels containing between 2000 and 6000 SNPs. In total, 9334 autosomal SNPs on 11 144 flockbook-recorded animals and 1172 crossbred animals were used. The population structure of all breeds was characterised by principal component analysis (PCA) as well as the pairwise breed fixation index (Fst). The calibration population for SNP-BLUP comprised 2579 animals, all of which were purebred, with the number of animals per breed ranging from 9 to 500. The remaining 9559 flockbook-recorded animals, composite breeds and crossbred animals represented the test population; three breeds were excluded from breed composition prediction. The breed composition predicted using SNP-BLUP with 9334 SNPs was considered the gold standard prediction. The pairwise breed Fst ranged from 0.040 (between the Irish Blackface and Scottish Blackface) to 0.282 (between the Border Leicester and Suffolk). Principal component analysis revealed that the Suffolk from Ireland and the Suffolk from New Zealand formed distinct, non-overlapping clusters. In contrast, the Texel from Ireland and the Texel from New Zealand formed integrated, overlapping clusters. Composite breeds such as the Belclare clustered close to their founder breeds (i.e., Finn, Galway, Lleyn and Texel).
When all 9334 SNPs were used to predict breed composition, an animal with a predicted majority breed proportion of ≥0.90 was defined as purebred for the present study. As the panel density decreased, the predicted breed proportion threshold used to identify animals as purebred also decreased (from ≥0.85 with 6000 SNPs to ≥0.60 with 2000 SNPs). Overall, results from the study suggest that breed composition for purebred and crossbred animals can be determined with SNP-BLUP using ≥5000 SNPs.
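The purebred rule described above can be illustrated with a minimal Python sketch. It assumes the predicted breed proportions for an animal are already available as a mapping from breed name to proportion (a hypothetical input format, for example the output of a SNP-BLUP prediction); the function name and the example animal are illustrative and are not taken from the study. Only the three panel densities and thresholds quoted in the abstract are encoded, since thresholds for other panels are not stated here.

```python
# Thresholds quoted in the abstract: panel size (number of SNPs) -> minimum
# predicted majority breed proportion for an animal to be called purebred.
PUREBRED_THRESHOLDS = {9334: 0.90, 6000: 0.85, 2000: 0.60}


def classify_animal(breed_proportions, n_snps):
    """Return (majority_breed, proportion, is_purebred) for one animal.

    breed_proportions: dict mapping breed name -> predicted proportion
        (hypothetical input format, assumed for this sketch).
    n_snps: panel density; only the densities quoted in the abstract are
        accepted, because no other thresholds are given there.
    """
    if n_snps not in PUREBRED_THRESHOLDS:
        raise ValueError(f"no threshold reported for a {n_snps}-SNP panel")
    # The majority breed is the breed with the largest predicted proportion.
    majority_breed = max(breed_proportions, key=breed_proportions.get)
    proportion = breed_proportions[majority_breed]
    is_purebred = proportion >= PUREBRED_THRESHOLDS[n_snps]
    return majority_breed, proportion, is_purebred


# Made-up predictions for illustration only (not data from the study).
animal = {"Texel": 0.87, "Suffolk": 0.08, "Belclare": 0.05}
print(classify_animal(animal, 9334))  # 0.87 is below the 0.90 threshold
print(classify_animal(animal, 6000))  # 0.87 meets the 0.85 threshold
```

The same predicted proportion can therefore lead to a different purebred call depending on the panel used, which is why the threshold is tied to panel density in this sketch.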


Supplementary materials

O’Brien et al. supplementary material: Tables S1-S4 and Figure S1 (Word, 2.3 MB)
