1 - Tensor models: solution methods and applications

from Part I - Mathematical foundations

Summary

This chapter introduces several models and associated computational tools for tensor data analysis. In particular, we discuss tensor principal component analysis, tensor low-rank and sparse decomposition models, and tensor co-clustering problems. Such models have a great variety of applications, with examples in computer vision, machine learning, image processing, statistics, and bioinformatics. For computational purposes, we present several useful tools in the context of tensor data analysis, including the alternating direction method of multipliers (ADMM) and block variable optimization techniques. We draw on applications from gene expression data analysis in bioinformatics to demonstrate the performance of some of the aforementioned tools.

Introduction

One rich source of big data is high-dimensional data stored in the format known as a tensor. Specifically, a complex-valued m-dimensional or mth-order tensor (a.k.a. m-way multiarray) can be denoted by 𝒯 ∈ ℂ^(n1×n2×…×nm), whose dimension in the ith direction is ni, i = 1, …, m. Vectors and matrices are special cases of tensors, with m = 1 and m = 2, respectively. In the era of big data analytics, huge-scale dense data in the form of tensors can be found in domains such as computer vision [1], diffusion magnetic resonance imaging (MRI) [2–4], the quantum entanglement problem [5], spectral hypergraph theory [6], and higher-order Markov chains [7]. For instance, a color image can be considered as 3D data, with rows, columns, and color channels in the three directions, while a color video sequence can be considered as 4D data, with time as the fourth dimension. Extracting useful information from such tensor data is therefore a meaningful task.
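To make the multiarray picture concrete, the following is a minimal sketch in Python with NumPy; the library choice and the helper name unfold are ours for illustration, since the chapter itself is tool-agnostic. It builds the image and video examples above as 3rd- and 4th-order tensors and shows the mode-n unfolding that flattens a tensor into a matrix, the basic bridge back to matrix tools.

```python
import numpy as np

# A color image is a 3rd-order tensor: rows x columns x color channels.
image = np.random.rand(480, 640, 3)

# A color video adds time as a fourth mode: frames x rows x columns x channels.
video = np.random.rand(100, 480, 640, 3)

def unfold(tensor, mode):
    """Mode-n unfolding (matricization): rows are indexed by mode `mode`,
    and all remaining modes are flattened into columns. Column-ordering
    conventions vary across the literature, e.g. [30]."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

print(image.ndim, video.ndim)   # 3 4
print(unfold(video, 0).shape)   # (100, 921600): one row per frame
```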

On the other hand, the past few years have witnessed the emergence of sparse and low-rank matrix optimization models and their applications in data science, signal processing, machine learning, bioinformatics, and so on. There have been extensive investigations of low-rank matrix completion and recovery problems since the seminal works [8–11]. Some important variants of sparse and low-rank matrix optimization problems, such as robust principal component analysis (PCA) [12, 13] and sparse PCA [14], have also been studied. A natural extension of the matrix to higher-dimensional space is the tensor. Traditional matrix-based data analysis is inherently two-dimensional, which limits its ability to extract information from a multi-dimensional perspective.
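For the matrix case that this chapter extends to tensors, the robust PCA model of [12, 13] decomposes an observed matrix M into a low-rank component L and a sparse component S by solving min ‖L‖_* + λ‖S‖_1 subject to L + S = M. The sketch below is a minimal ADMM implementation in Python/NumPy, previewing the splitting scheme used later in the chapter; the default λ = 1/√max(m, n) follows [12], while the penalty parameter ρ and the iteration count are illustrative choices of ours, not values the chapter prescribes.

```python
import numpy as np

def soft_threshold(X, tau):
    """Entrywise soft-thresholding: the proximal operator of tau * ||.||_1."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svt(X, tau):
    """Singular value thresholding: the proximal operator of tau * ||.||_*."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def robust_pca(M, lam=None, rho=1.0, n_iter=200):
    """ADMM for  min ||L||_* + lam * ||S||_1  s.t.  L + S = M  (cf. [12, 13])."""
    m, n = M.shape
    lam = 1.0 / np.sqrt(max(m, n)) if lam is None else lam
    L, S, Y = (np.zeros_like(M) for _ in range(3))
    for _ in range(n_iter):
        L = svt(M - S + Y / rho, 1.0 / rho)             # low-rank update
        S = soft_threshold(M - L + Y / rho, lam / rho)  # sparse update
        Y += rho * (M - L - S)                          # dual ascent step
    return L, S

# Example: a rank-2 matrix corrupted by sparse spikes.
rng = np.random.default_rng(0)
M = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 60))
M[rng.random(M.shape) < 0.05] += 10.0
L, S = robust_pca(M)
```

The two proximal steps, a shrinkage on singular values for the low-rank block and an entrywise shrinkage for the sparse block, are the pattern that the tensor low-rank and sparse decomposition models discussed in this chapter generalize.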

References

[1] H. Wang and N. Ahuja, "Compact representation of multidimensional data using tensor rank-one decomposition," in Proceedings of the 17th International Conference on Pattern Recognition (ICPR), 2004.
[2] A. Ghosh, E. Tsigaridas, M. Descoteaux, P. Comon, B. Mourrain, and R. Deriche, "A polynomial based approach to extract the maxima of an antipodally symmetric spherical function and its application to extract fiber directions from the orientation distribution function in diffusion MRI," in Computational Diffusion MRI Workshop (CDMRI'08), New York, 2008.
[3] L. Bloy and R. Verma, "On computing the underlying fiber directions from the diffusion orientation distribution function," in Medical Image Computing and Computer-Assisted Intervention, MICCAI 2008, D. Metaxas, L. Axel, G. Fichtinger, and G. Székely, Eds., 2008.
[4] L. Qi, G. Yu, and E. X. Wu, "Higher order positive semi-definite diffusion tensor imaging," SIAM Journal on Imaging Sciences, pp. 416–433, 2010.
[5] J. J. Hilling and A. Sudbery, "The geometric measure of multipartite entanglement and the singular values of a hypermatrix," J. Math. Phys., vol. 51, p. 072102, 2010.
[6] S. Hu and L. Qi, "Algebraic connectivity of an even uniform hypergraph," Journal of Combinatorial Optimization, vol. 24, pp. 564–579, 2012.
[7] W. Li and M. Ng, "Existence and uniqueness of stationary probability vector of a transition probability tensor," Department of Mathematics, The Hong Kong Baptist University, Tech. Rep., 2011.
[8] M. Fazel, H. Hindi, and S. Boyd, "Rank minimization and applications in system theory," in American Control Conference, 2004, pp. 3273–3278.
[9] B. Recht, M. Fazel, and P. Parrilo, "Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization," SIAM Review, vol. 52, no. 3, pp. 471–501, 2010.
[10] E. J. Candès and B. Recht, "Exact matrix completion via convex optimization," Foundations of Computational Mathematics, vol. 9, pp. 717–772, 2009.
[11] E. J. Candès and T. Tao, "The power of convex relaxation: near-optimal matrix completion," IEEE Trans. Inform. Theory, vol. 56, no. 5, pp. 2053–2080, 2010.
[12] E. J. Candès, X. Li, Y. Ma, and J. Wright, "Robust principal component analysis?" Journal of the ACM, vol. 58, no. 3, pp. 1–37, 2011.
[13] V. Chandrasekaran, S. Sanghavi, P. Parrilo, and A. Willsky, "Rank-sparsity incoherence for matrix decomposition," SIAM Journal on Optimization, vol. 21, no. 2, pp. 572–596, 2011.
[14] A. d'Aspremont, L. E. Ghaoui, M. I. Jordan, and G. R. G. Lanckriet, "A direct formulation for sparse PCA using semidefinite programming," SIAM Review, vol. 49, no. 3, pp. 434–448, 2007.
[15] S. Ma, D. Johnson, C. Ashby, et al., "SPARCoC: a new framework for molecular pattern discovery and cancer gene identification," PLoS ONE, vol. 10, no. 3, e0117135, 2015.
[16] X. Li, M. Ng, and X. Yuan, "Nuclear-norm-free variational models for background extraction from surveillance video," preprint, 2013.
[17] F. L. Hitchcock, "The expression of a tensor or a polyadic as a sum of products," Journal of Mathematics and Physics, vol. 6, no. 1, pp. 164–189, 1927.
[18] F. L. Hitchcock, "Multiple invariants and generalized rank of a p-way matrix or tensor," Journal of Mathematics and Physics, vol. 7, no. 1, pp. 39–79, 1927.
[19] J. D. Carroll and J. J. Chang, "Analysis of individual differences in multidimensional scaling via an n-way generalization of 'Eckart–Young' decomposition," Psychometrika, vol. 35, no. 3, pp. 283–319, 1970.
[20] R. A. Harshman, Foundations of the PARAFAC Procedure: Models and Conditions for an "Explanatory" Multimodal Factor Analysis. Los Angeles: University of California at Los Angeles, 1970.
[21] L. Mackey, "Deflation methods for sparse PCA," in Advances in Neural Information Processing Systems (NIPS), 2008.
[22] L. De Lathauwer, B. De Moor, and J. Vandewalle, "On the best rank-1 and rank-(r1, r2, …, rn) approximation of higher-order tensors," SIAM Journal on Matrix Analysis and Applications, vol. 21, no. 4, pp. 1324–1342, 2000.
[23] B. Jiang, S. Ma, and S. Zhang, "Tensor principal component analysis via convex optimization," Mathematical Programming, vol. 150, pp. 423–457, 2015.
[24] L. Qi, "Eigenvalues of a real supersymmetric tensor," Journal of Symbolic Computation, vol. 40, pp. 1302–1324, 2005.
[25] E. Kofidis and P. A. Regalia, "Tensor approximation and signal processing applications," in Structured Matrices in Mathematics, Computer Science, and Engineering I, V. Olshevsky, Ed., Contemporary Mathematics Series, American Mathematical Society, 2001.
[26] L. H. Lim, "Singular values and eigenvalues of tensors: a variational approach," in Proceedings of the IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), 2005.
[27] B. Chen, S. He, Z. Li, and S. Zhang, "Maximum block improvement and polynomial optimization," SIAM Journal on Optimization, vol. 22, pp. 87–107, 2012.
[28] L. Qi, F. Wang, and Y. Wang, "Z-eigenvalue methods for a global polynomial optimization problem," Mathematical Programming, Series A, vol. 118, pp. 301–316, 2009.
[29] E. Kofidis and P. A. Regalia, "On the best rank-1 approximation of higher-order supersymmetric tensors," SIAM Journal on Matrix Analysis and Applications, vol. 23, pp. 863–884, 2002.
[30] T. G. Kolda and B. W. Bader, "Tensor decompositions and applications," SIAM Review, vol. 51, pp. 455–500, 2009.
[31] T. G. Kolda and J. R. Mayo, "Shifted power method for computing tensor eigenpairs," SIAM Journal on Matrix Analysis and Applications, vol. 32, pp. 1095–1124, 2011.
[32] L. Suter, L. Babiss, and E. Wheeldon, "Toxicogenomics in predictive toxicology in drug development," Chem. Biol., vol. 11, pp. 161–171, 2004.
[33] Z. Magic, S. Radulovic, and M. Brankovic-Magic, "cDNA microarrays: identification of gene signatures and their application in clinical practice," J. BUON, vol. 12, Suppl. 1, pp. S39–44, 2007.
[34] A. Cheung, "Molecular targets in gynaecological cancers," Pathology, vol. 39, pp. 26–45, 2007.
[35] S. Zhang, K. Wang, B. Chen, and X. Huang, "A new framework for co-clustering of gene expression data," in PRIB 2011, ser. Lecture Notes in Bioinformatics, M. Loog et al., Eds., Springer-Verlag, 2011, vol. 7036, pp. 1–12.
[36] S. Zhang, K. Wang, C. Ashby, B. Chen, and X. Huang, "A unified adaptive co-identification framework for high-D expression data," in PRIB 2012, ser. Lecture Notes in Bioinformatics, Springer-Verlag, 2012, vol. 7632, pp. 59–70.
[37] P. D'haeseleer, "How does gene expression clustering work?" Nature Biotechnology, vol. 23, no. 12, pp. 1499–1502, 2005.
[38] M. Eisen, P. Spellman, P. Brown, and D. Botstein, "Cluster analysis and display of genome-wide expression patterns," Proceedings of the National Academy of Sciences, vol. 95, no. 25, pp. 14863–14868, 1998.
[39] S. Tavazoie, J. Hughes, M. Campbell, R. Cho, and G. Church, "Systematic determination of genetic network architecture," Nature Genetics, vol. 22, no. 3, pp. 281–285, 1999.
[40] P. Tamayo, D. Slonim, J. Mesirov, et al., "Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation," Proceedings of the National Academy of Sciences, vol. 96, no. 6, pp. 2907–2912, 1999.
[41] Y. Cheng and G. Church, "Biclustering of expression data," in ISMB, vol. 8, pp. 93–103, 2000.
[42] A. Prelić, S. Bleuler, P. Zimmermann, et al., "A systematic comparison and evaluation of biclustering methods for gene expression data," Bioinformatics, vol. 22, no. 9, pp. 1122–1129, 2006.
[43] M. Strauch, J. Supper, C. Spieth, et al., "A two-step clustering for 3-D gene expression data reveals the main features of the Arabidopsis stress response," Journal of Integrative Bioinformatics, vol. 4, no. 1, p. 54, 2007.
[44] A. Li and D. Tuck, "An effective tri-clustering algorithm combining expression data with gene regulation information," Gene Regulation and Systems Biology, vol. 3, pp. 49–64, 2008.
[45] L. Zhao and M. Zaki, "TriCluster: an effective algorithm for mining coherent clusters in 3D microarray data," in Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, ACM, 2005, pp. 694–705.
[46] D. Jiang, J. Pei, M. Ramanathan, C. Tang, and A. Zhang, "Mining coherent gene clusters from gene-sample-time microarray data," in Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, 2004, pp. 430–439.
[47] M. Deodhar, J. Ghosh, G. Gupta, H. Cho, and I. Dhillon, "Hunting for coherent co-clusters in high dimensional and noisy datasets," in IEEE International Conference on Data Mining Workshops (ICDMW'08), IEEE, 2008, pp. 654–663.
[48] A. Banerjee, I. Dhillon, J. Ghosh, S. Merugu, and D. Modha, "A generalized maximum entropy approach to Bregman co-clustering and matrix approximation," Journal of Machine Learning Research, vol. 8, pp. 1919–1986, 2007.
[49] I. Costa, F. de Carvalho, and M. de Souto, "Comparative analysis of clustering methods for gene expression time course data," Genetics and Molecular Biology, vol. 27, no. 4, pp. 623–631, 2004.
[50] F. Gibbons and F. Roth, "Judging the quality of gene expression-based clustering methods using gene annotation," Genome Research, vol. 12, no. 10, pp. 1574–1581, 2002.
[51] J. Håstad, "Tensor rank is NP-complete," J. Algorithms, vol. 11, pp. 644–654, 1990.
[52] J. B. Kruskal, "Rank, decomposition, and uniqueness for 3-way and N-way arrays," in Multiway Data Analysis, North-Holland, Amsterdam, 1989, pp. 7–18.
[53] J. Liu, P. Musialski, P. Wonka, and J. Ye, "Tensor completion for estimating missing values in visual data," in The Twelfth IEEE International Conference on Computer Vision, 2009.
[54] S. Gandy, B. Recht, and I. Yamada, "Tensor completion and low-n-rank tensor recovery via convex optimization," Inverse Problems, vol. 27, no. 2, p. 025010, 2011.
[55] R. Tomioka, K. Hayashi, and H. Kashima, "Estimation of low-rank tensors via convex optimization," preprint, 2011.
[56] D. Goldfarb and Z. Qin, "Robust low-rank tensor recovery: models and algorithms," preprint, 2013.
[57] R. Tomioka, T. Suzuki, and K. Hayashi, "Statistical performance of convex tensor decomposition," in NIPS, 2011.
[58] M. Signoretto, R. Van de Plas, B. De Moor, and J. Suykens, "Tensor versus matrix completion: a comparison with application to spectral data," IEEE Signal Processing Letters, vol. 18, no. 7, pp. 403–406, 2011.
[59] C. Mu, B. Huang, J. Wright, and D. Goldfarb, "Square deal: lower bounds and improved relaxations for tensor recovery," preprint, 2013.
[60] D. Kressner, M. Steinlechner, and B. Vandereycken, "Low-rank tensor completion by Riemannian optimization," preprint, 2013.
[61] A. Krishnamurthy and A. Singh, "Low-rank matrix and tensor completion via adaptive sampling," preprint, 2013.
[62] E. J. Candès, J. Romberg, and T. Tao, "Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information," IEEE Transactions on Information Theory, vol. 52, pp. 489–509, 2006.
[63] D. Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, pp. 1289–1306, 2006.
[64] F. Alizadeh, "Interior point methods in semidefinite programming with applications to combinatorial optimization," SIAM Journal on Optimization, vol. 5, pp. 13–51, 1995.
[65] M. X. Goemans and D. P. Williamson, "Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming," J. Assoc. Comput. Mach., vol. 42, no. 6, pp. 1115–1145, 1995.
[66] L. Vandenberghe and S. Boyd, "Semidefinite programming," SIAM Rev., vol. 38, no. 1, pp. 49–95, 1996.
[67] M. Grant and S. Boyd, "CVX: Matlab software for disciplined convex programming, version 1.21," http://cvxr.com/cvx, 2010.
[68] J. Douglas and H. H. Rachford, "On the numerical solution of the heat conduction problem in 2 and 3 space variables," Transactions of the American Mathematical Society, vol. 82, pp. 421–439, 1956.
[69] D. H. Peaceman and H. H. Rachford, "The numerical solution of parabolic and elliptic differential equations," SIAM Journal on Applied Mathematics, vol. 3, pp. 28–41, 1955.
[70] P. L. Lions and B. Mercier, "Splitting algorithms for the sum of two nonlinear operators," SIAM Journal on Numerical Analysis, vol. 16, pp. 964–979, 1979.
[71] M. Fortin and R. Glowinski, Augmented Lagrangian Methods: Applications to the Numerical Solution of Boundary-Value Problems, North-Holland Pub. Co., 1983.
[72] R. Glowinski and P. Le Tallec, Augmented Lagrangian and Operator-Splitting Methods in Nonlinear Mechanics, Philadelphia, Pennsylvania: SIAM, 1989.
[73] J. Eckstein, "Splitting methods for monotone operators with applications to parallel optimization," Ph.D. dissertation, Massachusetts Institute of Technology, 1989.
[74] J. Eckstein and D. P. Bertsekas, "On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators," Mathematical Programming, vol. 55, pp. 293–318, 1992.
[75] D. Gabay, "Applications of the method of multipliers to variational inequalities," in Augmented Lagrangian Methods: Applications to the Solution of Boundary Value Problems, M. Fortin and R. Glowinski, Eds., Amsterdam: North-Holland, 1983.
[76] J. Yang and Y. Zhang, "Alternating direction algorithms for ℓ1 problems in compressive sensing," SIAM Journal on Scientific Computing, vol. 33, no. 1, pp. 250–278, 2011.
[77] Y. Wang, J. Yang, W. Yin, and Y. Zhang, "A new alternating minimization algorithm for total variation image reconstruction," SIAM Journal on Imaging Sciences, vol. 1, no. 3, pp. 248–272, 2008.
[78] T. Goldstein and S. Osher, "The split Bregman method for L1-regularized problems," SIAM J. Imaging Sci., vol. 2, pp. 323–343, 2009.
[79] M. Tao and X. Yuan, "Recovering low-rank and sparse components of matrices from incomplete and noisy observations," SIAM J. Optim., vol. 21, pp. 57–81, 2011.
[80] X. Yuan, "Alternating direction methods for sparse covariance selection," Journal of Scientific Computing, vol. 51, pp. 261–273, 2012.
[81] K. Scheinberg, S. Ma, and D. Goldfarb, "Sparse inverse covariance selection via alternating linearization methods," in NIPS, 2010.
[82] S. Ma, "Alternating direction method of multipliers for sparse principal component analysis," Journal of the Operations Research Society of China, vol. 1, no. 2, pp. 253–274, 2013.
[83] Z. Wen, D. Goldfarb, and W. Yin, "Alternating direction augmented Lagrangian methods for semidefinite programming," Mathematical Programming Computation, vol. 2, pp. 203–230, 2010.
[84] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, "Distributed optimization and statistical learning via the alternating direction method of multipliers," Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1–122, 2011.
[85] S. Ma, D. Goldfarb, and L. Chen, "Fixed point and Bregman iterative methods for matrix rank minimization," Mathematical Programming, Series A, vol. 128, pp. 321–353, 2011.
[86] D. P. Bertsekas, Nonlinear Programming, 2nd edn, Belmont, Massachusetts: Athena Scientific, 1999.
[87] P. Tseng, "Convergence of a block coordinate descent method for nondifferentiable minimization," J. Optim. Theory Appl., vol. 109, no. 3, pp. 475–494, 2001.
[88] D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods, Upper Saddle River, NJ: Prentice-Hall, 1989.
[89] L. Grippo and M. Sciandrone, "On the convergence of the block nonlinear Gauss–Seidel method under convex constraints," Oper. Res. Lett., vol. 26, no. 3, pp. 127–136, 2000.
[90] Z.-Q. Luo and P. Tseng, "On the convergence of the coordinate descent method for convex differentiable minimization," J. Optim. Theory Appl., vol. 72, no. 1, pp. 7–35, 1992.
[91] Z.-Q. Luo and P. Tseng, "On the linear convergence of descent methods for convex essentially smooth minimization," SIAM Journal on Control and Optimization, vol. 30, no. 2, pp. 408–425, 1992.
[92] Z.-Q. Luo and P. Tseng, "Error bounds and convergence analysis of feasible descent methods: a general approach," Annals of Operations Research, vol. 46, no. 1, pp. 157–178, 1993.
[93] A. Beck and L. Tetruashvili, "On the convergence of block coordinate descent type methods," SIAM Journal on Optimization, vol. 23, no. 4, pp. 2037–2060, 2013.
[94] M. Hong, X. Wang, M. Razaviyayn, and Z.-Q. Luo, "Iteration complexity analysis of block coordinate descent methods," arXiv preprint arXiv:1310.6957, 2013.
[95] Z. Li, A. Uschmajew, and S. Zhang, "On convergence of the maximum block improvement method," to appear in SIAM Journal on Optimization, 2013.
[96] A. A. Alizadeh et al., "Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling," Nature, vol. 403, no. 6769, pp. 503–511, 2000.
[97] J. Kilian, D. Whitehead, J. Horak, et al., "The AtGenExpress global stress expression data set: protocols, evaluation and model data analysis of UV-B light, drought and cold stress responses," The Plant Journal, vol. 50, no. 2, pp. 347–363, 2007.
[98] J. Supper, M. Strauch, D. Wanke, K. Harter, and A. Zell, "EDISA: extracting biclusters from multiple time-series of gene expression profiles," BMC Bioinformatics, vol. 8, no. 1, pp. 334–347, 2007.
[99] H. Cho, I. S. Dhillon, Y. Guan, and S. Sra, "Minimum sum-squared residue co-clustering of gene expression data," in Proceedings of the Fourth SIAM International Conference on Data Mining, vol. 3, SIAM, 2004, pp. 114–125.
[100] T. Li, H.-J. Kung, P. C. Mack, and D. R. Gandara, "Genotyping and genomic profiling of non-small-cell lung cancer: implications for current and future therapies," Journal of Clinical Oncology, vol. 31, no. 8, pp. 1039–1049, 2013.
[101] L. West, S. J. Vidwans, N. P. Campbell, et al., "A novel classification of lung cancer into molecular subtypes," PLoS ONE, vol. 7, no. 2, p. e31906, 2012.
[102] K. Shedden et al., "Gene expression-based survival prediction in lung adenocarcinoma: a multi-site, blinded validation study," Nature Medicine, vol. 14, no. 8, pp. 822–827, 2008.