Comparison of genotype imputation strategies using a combined reference panel for chicken population

  • S. Ye (a1), X. Yuan (a1), S. Huang (a1), H. Zhang (a1), Z. Chen (a1), J. Li (a1), X. Zhang (a1) and Z. Zhang (a1)...

Abstract

Whole-genome sequence (WGS) data are expected to be optimal for genome-wide association studies and genomic predictions. However, sequencing thousands of individuals of interest is expensive. Imputation from single nucleotide polymorphism (SNP) panels to WGS data is an attractive approach to obtaining highly reliable WGS data at low cost. Here, we conducted a genotype imputation study with a combined reference panel in a yellow-feather dwarf broiler population. The combined reference panel was assembled from WGS data of 24 key individuals sequenced from the yellow-feather dwarf broiler population (internal reference panel) and WGS data of 311 chickens from public databases (external reference panel). Three scenarios were investigated to determine how different factors affect the accuracy of imputation from 600 K array data to WGS data: genotype imputation with internal, external and combined reference panels; the number of internal reference individuals in the combined reference panel; and different reference sizes and selection strategies for the external reference panel. Imputation accuracy from 600 K to WGS data was 0.834±0.012, 0.920±0.007 and 0.982±0.003 for the internal, external and combined reference panels, respectively. Increasing the reference size from 50 to 250 improved the accuracy of genotype imputation from 0.848 to 0.974 for the combined reference panel and from 0.647 to 0.917 for the external reference panel. The selection strategy for the external reference panel had no impact on the accuracy of imputation using the combined reference panel. However, when only an external reference panel with a reference size >50 was used, the strategy of minimizing the average distance to the closest leaf gave the highest imputation accuracy among the methods compared. Generally, using a combined reference panel provided greater imputation accuracy, especially for low-frequency variants. In conclusion, the optimal imputation strategy with a combined reference panel should comprehensively consider the genetic diversity of the study population, the availability and properties of external reference panels, sequencing and computing costs, and the frequency of the imputed variants. This work sheds light on how to design and execute genotype imputation with a combined external reference panel in a livestock population.
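As a rough illustration of the idea behind selecting external reference individuals by minimizing the average distance to the closest leaf, the sketch below greedily picks candidates so that each study animal is, on average, as close as possible to its nearest chosen reference. It is not the procedure used in this study. The strategy is defined on the leaves of a phylogenetic tree, whereas this sketch assumes a precomputed pairwise genetic distance matrix and a simple greedy heuristic. The function names and the toy distances are hypothetical.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def average_distance_to_closest(dist: Matrix,
                                targets: Sequence[int],
                                chosen: Sequence[int]) -> float:
    """Mean, over target individuals, of the distance to the nearest chosen reference."""
    return sum(min(dist[t][c] for c in chosen) for t in targets) / len(targets)


def greedy_select(dist: Matrix,
                  targets: Sequence[int],
                  candidates: Sequence[int],
                  k: int) -> List[int]:
    """Pick k candidates one at a time, each time adding the candidate that
    gives the lowest average distance from targets to their closest pick.
    This is a heuristic (assumption), not an exact minimization."""
    if not 0 < k <= len(candidates):
        raise ValueError("k must be between 1 and the number of candidates")
    chosen: List[int] = []
    remaining = list(candidates)
    for _ in range(k):
        best = min(
            remaining,
            key=lambda c: average_distance_to_closest(dist, targets, chosen + [c]),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Toy symmetric distance matrix (hypothetical values).
# Individuals 0-1 are study animals; 2-5 are external candidates.
DIST = [
    [0.0, 0.2, 0.1, 0.8, 0.5, 0.9],
    [0.2, 0.0, 0.7, 0.15, 0.5, 0.9],
    [0.1, 0.7, 0.0, 0.6, 0.4, 0.8],
    [0.8, 0.15, 0.6, 0.0, 0.4, 0.8],
    [0.5, 0.5, 0.4, 0.4, 0.0, 0.3],
    [0.9, 0.9, 0.8, 0.8, 0.3, 0.0],
]

selected = greedy_select(DIST, targets=[0, 1], candidates=[2, 3, 4, 5], k=2)
print(selected)  # [2, 3]
```

In this toy case the two picks are the candidates closest to each study animal, which mirrors the intuition that an external panel is most useful when every target individual has a close relative in it.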

Supplementary materials

Ye et al. supplementary material 1 (Word, 974 KB)
