  • Print publication year: 2016
  • Online publication date: December 2015

13 - Inference of gene networks associated with the host response to infectious disease

from Part IV - Big data over biological networks

Summary

Inspired by the problem of inferring gene networks associated with the host response to infectious diseases, a new framework for discriminative factor models is developed. Bayesian shrinkage priors are employed to impose (near) sparsity on the factor loadings, while non-parametric techniques are used to infer the number of factors needed to represent the data. Two discriminative Bayesian loss functions are investigated: the logistic log-loss and the max-margin hinge loss. Efficient mean-field variational Bayesian inference and Gibbs sampling are implemented. To address large-scale datasets, an online version of variational Bayes is also developed. Experimental results on two real-world microarray-based gene expression datasets show that the proposed framework achieves classification performance superior to competing approaches, with model interpretation delivered via pathway association analysis.
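To make the two discriminative losses concrete, here is a minimal Python sketch (illustrative only; the names `beta`, `s`, and all values are assumptions, not the chapter's code) that evaluates both losses on the factor scores of a single sample:

```python
# Sketch of the two discriminative loss functions named above, written as
# functions of the margin m = y * beta^T s, where s holds the factor scores
# of one sample, y in {-1, +1} is its phenotype label, and beta is a
# classifier weight vector. Illustrative only.
import numpy as np

def logistic_log_loss(m):
    # log(1 + exp(-m)); np.log1p improves numerical behavior
    return np.log1p(np.exp(-m))

def hinge_loss(m):
    # max(0, 1 - m): zero once the margin exceeds 1
    return np.maximum(0.0, 1.0 - m)

beta = np.array([0.5, -1.0, 0.25])   # hypothetical classifier weights
s = np.array([1.0, 0.2, -0.3])       # hypothetical factor scores, one sample
y = 1                                # phenotype label in {-1, +1}

m = y * (beta @ s)                   # margin y * beta^T s
print(logistic_log_loss(m), hinge_loss(m))
```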

Background

From a statistical-modeling perspective, gene expression analysis can be roughly divided into two phases: exploration and prediction. In the former, the practitioner seeks a general understanding of a dataset by modeling its variability in an interpretable way, so that the inferred model can serve as a feature extractor and hypothesis-generating mechanism for the underlying biological processes. Factor models are among the most widely employed techniques for exploratory gene expression analysis [1, 2], with principal component analysis a popular special case [3]. Predictive modeling, on the other hand, is concerned with finding a relationship between gene expression and phenotypes that generalizes to unseen samples. Examples of predictive models include classification methods such as logistic regression and support vector machines [4, 5].
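As a toy contrast of the two phases, the following sketch assumes scikit-learn is available (an assumption for illustration; it is not referenced in the text) and uses a synthetic matrix as a stand-in for real expression data:

```python
# Exploration vs. prediction on a synthetic stand-in for an expression
# matrix (rows = samples, columns = genes). scikit-learn is assumed here
# purely for illustration.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 500))       # 100 samples, 500 genes
y = rng.integers(0, 2, size=100)      # binary phenotype

# Exploration: summarize the variability with a few principal components [3].
scores = PCA(n_components=5).fit_transform(X)

# Prediction: estimate generalization to unseen samples via cross-validation,
# with logistic regression as the classifier [4].
acc = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores.shape, acc.mean())
```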

Factor models infer a latent covariance structure among the genes or biomarkers, with the data modeled as generated from a noisy low-rank matrix factorization, expressed in terms of a loadings matrix and a factor-scores matrix. Different specifications for these matrices give rise to special cases of factor models, such as principal component analysis [6], nonnegative matrix factorization [7], independent component analysis [8], and sparse factor models [1]. Factor models employing a sparse factor-loadings matrix are of particular interest in gene-expression analysis, as the nonzero elements in the loadings matrix may be interpreted as correlated gene networks [1, 2, 9].
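In equation form (using notation chosen here for illustration, which may differ from the chapter's own symbols), such a model reads

$$
\mathbf{x}_n = \mathbf{A}\,\mathbf{s}_n + \boldsymbol{\epsilon}_n, \qquad \boldsymbol{\epsilon}_n \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Psi}),
$$

where $\mathbf{x}_n \in \mathbb{R}^p$ collects the expression of $p$ genes in sample $n$, $\mathbf{A}$ is the $p \times k$ factor-loadings matrix, $\mathbf{s}_n$ is the $k$-dimensional vector of factor scores, and $\boldsymbol{\Psi}$ is a diagonal noise covariance. Placing a shrinkage prior on the entries of $\mathbf{A}$, for instance the Bayesian lasso [17] or the horseshoe [18], drives most loadings to (near) zero, so the genes with nonzero loadings on a given column of $\mathbf{A}$ may be read as one correlated gene network.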

References

[1] C. M. Carvalho, J. Chang, J. E. Lucas, et al., “High-dimensional sparse factor modeling: applications in gene expression genomics,” Journal of the American Statistical Association, vol. 103, no. 484, pp. 1438–1456, 2008.
[2] J. Lucas, C. Carvalho, and M. West, “A Bayesian analysis strategy for cross-study translation of gene expression biomarkers,” Statistical Applications in Genetics and Molecular Biology, vol. 8, no. 1, pp. 1–26, 2009.
[3] T. Speed, Statistical Analysis of Gene Expression Microarray Data, CRC Press, 2003.
[4] S. Dudoit, J. Fridlyand, and T. P. Speed, “Comparison of discrimination methods for the classification of tumors using gene expression data,” Journal of the American Statistical Association, vol. 97, no. 457, pp. 77–87, 2002.
[5] I. Guyon, J. Weston, S. Barnhill, and V. Vapnik, “Gene selection for cancer classification using support vector machines,” Machine Learning, vol. 46, pp. 389–422, 2002.
[6] I. Jolliffe, Principal Component Analysis, Wiley Online Library, 2005.
[7] D. D. Lee and H. S. Seung, “Algorithms for non-negative matrix factorization,” in Advances in Neural Information Processing Systems, 2001.
[8] A. Hyvärinen, J. Karhunen, and E. Oja, Independent Component Analysis, John Wiley & Sons, 2004.
[9] L. Carin, A. Hero III, J. Lucas, D. Dunson, et al., “High-dimensional longitudinal genomic data: an analysis used for monitoring viral infections,” IEEE Signal Processing Magazine, vol. 29, no. 1, pp. 108–123, 2012.
[10] M. E. Tipping, “Sparse Bayesian learning and the relevance vector machine,” The Journal of Machine Learning Research, vol. 1, pp. 211–244, 2001.
[11] B. Krishnapuram, D. Williams, Y. Xue, et al., “On semi-supervised classification,” in Advances in Neural Information Processing Systems, 2004.
[12] C. M. Bishop, Pattern Recognition and Machine Learning, Springer, New York, 2006.
[13] M. West, “Bayesian factor regression models in the ‘large p, small n’ paradigm,” in Bayesian Statistics 7, J. Bernardo, M. Bayarri, J. Berger, et al., eds., pp. 733–742, 2003.
[14] A. M. Kagan, C. R. Rao, and Y. V. Linnik, Characterization Problems in Mathematical Statistics, Wiley, 1973.
[15] H. Ishwaran and L. F. James, “Gibbs sampling methods for stick-breaking priors,” Journal of the American Statistical Association, vol. 96, no. 453, pp. 161–173, 2001.
[16] R. Henao and O. Winther, “Sparse linear identifiable multivariate modeling,” The Journal of Machine Learning Research, vol. 12, pp. 863–905, 2011.
[17] T. Park and G. Casella, “The Bayesian lasso,” Journal of the American Statistical Association, vol. 103, no. 482, pp. 681–686, 2008.
[18] C. M. Carvalho, N. G. Polson, and J. G. Scott, “Handling sparsity via the horseshoe,” in International Conference on Artificial Intelligence and Statistics, 2009, pp. 73–80.
[19] N. G. Polson and J. G. Scott, “Shrink globally, act locally: sparse Bayesian regularization and prediction,” Bayesian Statistics, vol. 9, pp. 501–538, 2010.
[20] A. Armagan, D. B. Dunson, and M. Clyde, “Generalized beta mixtures of Gaussians,” in Advances in Neural Information Processing Systems, 2011.
[21] A. K. Zaas, M. Chen, J. Varkey, et al., “Gene expression signatures diagnose influenza and other symptomatic respiratory viral infections in humans,” Cell Host & Microbe, vol. 6, no. 3, pp. 207–217, 2009.
[22] B. Chen, M. Chen, J. Paisley, et al., “Bayesian inference of the number of factors in gene-expression analysis: application to human virus challenge studies,” BMC Bioinformatics, vol. 11, no. 1, pp. 1–16, 2010.
[23] M. Chen, D. Carlson, A. Zaas, et al., “Detection of viruses via statistical gene expression analysis,” IEEE Transactions on Biomedical Engineering, vol. 58, no. 3, pp. 468–479, 2011.
[24] C. W. Woods, M. T. McClain, M. Chen, et al., “A host transcriptional signature for presymptomatic detection of infection in humans exposed to influenza H1N1 or H3N2,” PLoS One, vol. 8, no. 1, e52198, pp. 1–9, 2013.
[25] R. Henao, J. W. Thompson, M. A. Moseley, et al., “Latent protein trees,” Annals of Applied Statistics, vol. 7, no. 2, pp. 691–713, 2013.
[26] A. K. Zaas, B. H. Garner, E. L. Tsalik, et al., “The current epidemiology and clinical decisions surrounding acute respiratory infections,” Trends in Molecular Medicine, vol. 20, no. 10, pp. 579–588, 2014.
[27] J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman, “Supervised dictionary learning,” in Advances in Neural Information Processing Systems, 2009.
[28] J. H. Albert and S. Chib, “Bayesian analysis of binary and polychotomous response data,” Journal of the American Statistical Association, vol. 88, no. 422, pp. 669–679, 1993.
[29] N. G. Polson, J. G. Scott, and J. Windle, “Bayesian inference for logistic models using Pólya–Gamma latent variables,” Journal of the American Statistical Association, vol. 108, no. 504, pp. 1339–1349, 2013.
[30] N. G. Polson and S. L. Scott, “Data augmentation for support vector machines,” Bayesian Analysis, vol. 6, no. 1, pp. 1–23, 2011.
[31] N. Quadrianto, V. Sharmanska, D. A. Knowles, and Z. Ghahramani, “The supervised IBP: neighbourhood preserving infinite latent feature models,” in Uncertainty in Artificial Intelligence, 2013.
[32] E. Salazar, M. S. Cain, S. R. Mitroff, and L. Carin, “Inferring latent structure from mixed real and categorical relational data,” in International Conference on Machine Learning, 2012.
[33] C. Hans, “Bayesian lasso regression,” Biometrika, vol. 96, no. 4, pp. 835–845, 2009.
[34] J. O. Berger and W. E. Strawderman, “Choice of hierarchical priors: admissibility in estimation of normal means,” The Annals of Statistics, pp. 931–951, 1996.
[35] J. E. Griffin and P. J. Brown, “Bayesian adaptive lassos with non-convex penalization,” Tech. Rep., Centre for Research in Statistical Methodology, University of Warwick, 2007.
[36] N. G. Polson and J. G. Scott, “Local shrinkage rules, Lévy processes and regularized regression,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 74, no. 2, pp. 287–311, 2012.
[37] T. L. Griffiths and Z. Ghahramani, “Infinite latent feature models and the Indian buffet process,” in Advances in Neural Information Processing Systems, 2005.
[38] A. Bhattacharya and D. B. Dunson, “Sparse Bayesian infinite factor models,” Biometrika, vol. 98, no. 2, pp. 291–306, 2011.
[39] H. F. Lopes and M. West, “Bayesian model assessment in factor analysis,” Statistica Sinica, vol. 14, no. 1, pp. 41–68, 2004.
[40] J. Paisley and L. Carin, “Nonparametric factor analysis with beta process priors,” in International Conference on Machine Learning, 2009.
[41] X. Zhang and L. Carin, “Joint modeling of a matrix with associated text via latent binary features,” in Advances in Neural Information Processing Systems, 2012.
[42] C. Cortes and V. Vapnik, “Support-vector networks,” Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.
[43] D. F. Andrews and C. L. Mallows, “Scale mixtures of normal distributions,” Journal of the Royal Statistical Society. Series B (Methodological), pp. 99–102, 1974.
[44] N. G. Polson and S. L. Scott, “Data augmentation for support vector machines,” Bayesian Analysis, vol. 6, no. 1, pp. 1–23, 2011.
[45] Y. Xue, X. Liao, L. Carin, and B. Krishnapuram, “Multi-task learning for classification with Dirichlet process priors,” The Journal of Machine Learning Research, vol. 8, pp. 35–63, 2007.
[46] S. Ji, D. Dunson, and L. Carin, “Multitask compressive sensing,” IEEE Transactions on Signal Processing, vol. 57, no. 1, pp. 92–106, 2009.
[47] S. Geman and D. Geman, “Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 6, no. 6, pp. 721–741, 1984.
[48] C. Andrieu, N. de Freitas, A. Doucet, and M. I. Jordan, “An introduction to MCMC for machine learning,” Machine Learning, vol. 50, pp. 5–43, 2003.
[49] M. J. Beal, “Variational algorithms for approximate Bayesian inference,” Ph.D. dissertation, University College London, 2003.
[50] S. Kullback and R. A. Leibler, “On information and sufficiency,” The Annals of Mathematical Statistics, vol. 22, no. 1, pp. 79–86, 1951.
[51] M. Hoffman, F. R. Bach, and D. M. Blei, “Online learning for latent Dirichlet allocation,” in Advances in Neural Information Processing Systems, 2010.
[52] M. D. Hoffman, D. M. Blei, C. Wang, and J. Paisley, “Stochastic variational inference,” The Journal of Machine Learning Research, vol. 14, no. 1, pp. 1303–1347, 2013.
[53] Z. Wu, R. A. Irizarry, R. Gentleman, F. Martinez-Murillo, and F. Spencer, “A model-based background adjustment for oligonucleotide expression arrays,” Journal of the American Statistical Association, vol. 99, no. 468, pp. 909–917, 2004.
[54] S. T. Anderson, M. Kaforou, A. J. Brent, et al., “Diagnosis of childhood tuberculosis and host RNA expression in Africa,” New England Journal of Medicine, vol. 370, no. 18, pp. 1712–1723, 2014.
[55] T. Fawcett, “An introduction to ROC analysis,” Pattern Recognition Letters, vol. 27, no. 8, pp. 861–874, 2006.
[56] D. W. Huang, B. T. Sherman, and R. A. Lempicki, “Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources,” Nature Protocols, vol. 4, no. 1, pp. 44–57, 2008.
[57] Y. Benjamini and Y. Hochberg, “Controlling the false discovery rate: a practical and powerful approach to multiple testing,” Journal of the Royal Statistical Society. Series B (Methodological), pp. 289–300, 1995.
[58] R. Henao, X. Yuan, and L. Carin, “Bayesian nonlinear support vector machines and discriminative factor modeling,” in Advances in Neural Information Processing Systems, 2014.
[59] X. Yuan, R. Henao, E. L. Tsalik, R. J. Longley, and L. Carin, “Non-Gaussian discriminative factor models via the max-margin rank-likelihood,” in International Conference on Machine Learning, 2015.