Comparison of genotype imputation strategies using a combined reference panel for chicken population

Published online by Cambridge University Press: 29 October 2018

S. Ye, X. Yuan, S. Huang, H. Zhang, Z. Chen, J. Li, X. Zhang and Z. Zhang*
Affiliation (all authors):
Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou 510642, China

Abstract

Whole-genome sequence (WGS) data are expected to be optimal for genome-wide association studies and genomic prediction. However, sequencing thousands of individuals of interest is expensive. Imputation from single nucleotide polymorphism (SNP) panels to WGS data is an attractive approach to obtaining highly reliable WGS data at low cost. Here, we conducted a genotype imputation study with a combined reference panel in a yellow-feather dwarf broiler population. The combined reference panel was assembled from sequence data of 24 key individuals of the yellow-feather dwarf broiler population (internal reference panel) and WGS data from 311 chickens in public databases (external reference panel). Three scenarios were investigated to determine how different factors affect the accuracy of imputation from 600 K array data to WGS data: genotype imputation with internal, external and combined reference panels; the number of internal reference individuals in the combined reference panel; and different reference sizes and selection strategies for the external reference panel. Imputation accuracy from 600 K to WGS data was 0.834±0.012, 0.920±0.007 and 0.982±0.003 for the internal, external and combined reference panels, respectively. Increasing the reference size from 50 to 250 improved the accuracy of genotype imputation from 0.848 to 0.974 for the combined reference panel and from 0.647 to 0.917 for the external reference panel. The selection strategy for the external reference panel had no impact on the accuracy of imputation using the combined reference panel. However, when only an external reference panel with reference size >50 was used, the selection strategy of minimizing the average distance to the closest leaf gave the greatest imputation accuracy among the methods compared. Generally, using a combined reference panel provided greater imputation accuracy, especially for low-frequency variants.
In conclusion, the optimal imputation strategy with a combined reference panel should comprehensively consider the genetic diversity of the study population, the availability and properties of external reference panels, sequencing and computing costs, and the frequency of imputed variants. This work sheds light on how to design and execute genotype imputation with a combined external reference panel in a livestock population.
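
The selection strategy named in the abstract, minimizing the average distance to the closest leaf, can be illustrated with a rough sketch. The version below is not the authors' implementation: it assumes a precomputed pairwise genetic distance matrix instead of a phylogenetic tree, and it uses a simple greedy search rather than an exact optimizer. The function name `greedy_adcl_selection` and the toy distance values are hypothetical and chosen only for illustration.

```python
def greedy_adcl_selection(dist, k):
    """Greedily pick k individuals so that the mean distance from every
    individual to its closest selected individual is small.

    Assumptions (not from the paper): `dist` is a symmetric n x n matrix of
    pairwise distances, and a greedy search is an acceptable approximation.
    """
    n = len(dist)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of individuals")

    selected = []
    # closest[i] = distance from individual i to its nearest selected individual
    closest = [float("inf")] * n

    for _ in range(k):
        best_candidate, best_score = None, float("inf")
        for c in range(n):
            if c in selected:
                continue
            # Average closest-distance if candidate c were added to the panel
            score = sum(min(closest[i], dist[i][c]) for i in range(n)) / n
            if score < best_score:
                best_candidate, best_score = c, score
        selected.append(best_candidate)
        closest = [min(closest[i], dist[i][best_candidate]) for i in range(n)]

    return selected, sum(closest) / n


# Toy example (hypothetical distances): individuals 0-2 form one cluster,
# individuals 3-4 form another.
toy_dist = [
    [0, 1, 1, 9, 9],
    [1, 0, 1, 9, 9],
    [1, 1, 0, 9, 9],
    [9, 9, 9, 0, 1],
    [9, 9, 9, 1, 0],
]
selected, adcl = greedy_adcl_selection(toy_dist, k=2)
print(selected, adcl)  # [0, 3] 0.6
```

With two reference slots, the sketch picks one individual from each cluster, so every unselected individual has a close relative in the panel. That is the intuition behind preferring this criterion when an external panel must be subsampled.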

Type
Research Article
Copyright
© The Animal Consortium 2018 

Supplementary material: File

Ye et al. supplementary material