
Comparison of optimization methods for core subset selection from a large collection of Mexican wheat landraces characterized by SNP markers

Published online by Cambridge University Press:  25 September 2017

Carlos L. Acuña-Matamoros
Affiliation:
Departamento de Fitomejoramiento, Universidad Autónoma Agraria Antonio Narro, Buenavista, 25315, Saltillo, Coah., Mexico
M. Humberto Reyes-Valdés*
Affiliation:
Departamento de Fitomejoramiento, Universidad Autónoma Agraria Antonio Narro, Buenavista, 25315, Saltillo, Coah., Mexico
*Corresponding author. E-mail: mathgenome@gmail.com

Abstract

Core subset selection from collections hosted by seed banks grows in importance as the number of accessions and the amount of genetic marker information rapidly increase. A data set of 20,526 single-nucleotide polymorphism (SNP) markers characterizing 7986 Mexican creole wheat landraces was used to test 11 methods for core subset selection through optimization criteria containing average genetic distance and genetic diversity. Allele richness was used as an additional criterion to qualify the generated core subsets. Three replications with random samples of 1500 SNP loci, each comprising a maximum of 3000 alleles, were used to perform the method evaluations through four different objective functions. The LR greedy search (LR) and LR with random first pair (LRSemi) were consistently best across all assays for maximizing the objective functions, and they performed well even for criteria not included in those functions. Tukey's HSD (honest significant difference) multiple comparisons grouped those methods together with the sequential forward selection (SFS) and SFS with random first pair (SFSSemi) strategies as the top set of approaches. All of them are simple heuristic maximization algorithms, and they outperformed two more sophisticated optimization approaches: parallel mixed replica exchange and replica exchange Monte Carlo. Because of their efficiency in optimizing the objective functions and their computing speed, the LRSemi and SFSSemi methods proved to be good alternatives for core subset selection from large collections of highly homozygous accessions characterized by many biallelic markers.
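To make the idea of a simple heuristic maximization concrete, the following is a minimal Python sketch of a sequential forward selection seeded with a random first pair, in the spirit of the SFSSemi strategy named above. It is an illustration under stated assumptions, not the authors' implementation: the 0/1 coding of homozygous biallelic genotypes, the mismatch-based distance, the use of average pairwise distance as the only objective, and all function names are hypothetical choices made for this example.

```python
import random
from itertools import combinations


def mismatch_distance(a, b):
    """Fraction of loci at which two 0/1-coded genotypes differ (assumed distance)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def average_distance(genotypes, subset):
    """Mean pairwise distance among the accessions indexed by `subset`."""
    pairs = list(combinations(subset, 2))
    total = sum(mismatch_distance(genotypes[i], genotypes[j]) for i, j in pairs)
    return total / len(pairs)


def sfs_random_first_pair(genotypes, core_size, seed=0):
    """Greedy forward selection that starts from a random pair of accessions.

    At each step, the accession whose addition gives the highest average
    pairwise distance is appended to the core subset (core_size must be >= 2).
    """
    rng = random.Random(seed)
    selected = rng.sample(range(len(genotypes)), 2)
    remaining = set(range(len(genotypes))) - set(selected)
    while len(selected) < core_size:
        # sorted() makes tie-breaking deterministic for a given seed
        best = max(
            sorted(remaining),
            key=lambda i: average_distance(genotypes, selected + [i]),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: six hypothetical accessions scored at six biallelic loci.
genotypes = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 1, 1, 0],
]
core = sfs_random_first_pair(genotypes, core_size=4, seed=1)
print(sorted(core), round(average_distance(genotypes, core), 3))
```

Each greedy step re-evaluates the objective for every remaining candidate, so this sketch is only practical for small inputs. A collection of several thousand accessions would call for a precomputed distance matrix and incremental updates of the objective.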

Type
Research Article
Copyright
Copyright © NIAB 2017 


Supplementary material: PDF

Acuña-Matamoros and Reyes-Valdés supplementary material

Tables S1-S2
