This chapter introduces several models and associated computational tools for tensor data analysis. In particular, we discuss tensor principal component analysis, tensor low-rank and sparse decomposition models, and tensor co-clustering problems. These models have a wide variety of applications, with examples in computer vision, machine learning, image processing, statistics, and bioinformatics. For computational purposes, we present several tools that are useful in tensor data analysis, including the alternating direction method of multipliers (ADMM) and block variable optimization techniques. We draw on applications from gene expression data analysis in bioinformatics to demonstrate the performance of some of these tools.
Introduction
One rich source of big data is high-dimensional data stored in formats known as tensors. Specifically, a complex-valued m-dimensional or mth-order tensor (a.k.a. m-way multiarray) can be denoted by $\mathcal{F} \in \mathbb{C}^{n_1 \times n_2 \times \cdots \times n_m}$, whose dimension in the ith direction is $n_i$, $i = 1, \ldots, m$. Vectors and matrices are the special cases of tensors with m = 1 and m = 2, respectively. In the era of big data analytics, huge-scale dense data in the form of tensors can be found in domains such as computer vision [1], diffusion magnetic resonance imaging (MRI) [2–4], the quantum entanglement problem [5], spectral hypergraph theory [6], and higher-order Markov chains [7]. For instance, a color image can be considered as 3D data with row, column, and color as its three directions, while a color video sequence can be considered as 4D data, where time is the fourth dimension. Extracting useful information from such tensor data is therefore an important task.
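As a small illustration of the notation above, the following sketch stores tensors of order 1 through 4 as NumPy arrays and reads off their order m and dimensions. The use of NumPy, the array sizes, and the helper name `tensor_order` are assumptions made only for this example; they do not come from the chapter.

```python
import numpy as np

# Hypothetical sizes, chosen only to keep the example small.
rows, cols, colors, frames = 4, 5, 3, 2

vector = np.zeros(rows)                          # m = 1
matrix = np.zeros((rows, cols))                  # m = 2
image = np.zeros((rows, cols, colors))           # m = 3: row, column, color
video = np.zeros((rows, cols, colors, frames))   # m = 4: time is the fourth dimension


def tensor_order(t):
    """Return the order m of a tensor stored as a NumPy array (illustrative helper)."""
    return t.ndim


orders = [tensor_order(t) for t in (vector, matrix, image, video)]

# A complex-valued third-order tensor with n_1 = 2, n_2 = 3, n_3 = 4.
F = np.zeros((2, 3, 4), dtype=complex)

print(orders)       # order m of each array
print(video.shape)  # dimensions (n_1, n_2, n_3, n_4) of the video tensor
```

Here `ndim` plays the role of the order m and `shape` lists the dimensions $n_1, \ldots, n_m$. Ordering the video axes as row, column, color, time is one possible convention among several.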
On the other hand, the past few years have witnessed the emergence of sparse and low-rank matrix optimization models and their applications in data science, signal processing, machine learning, bioinformatics, and so on. Low-rank matrix completion and recovery problems have been investigated extensively since the seminal works of [8–11]. Some important variants of sparse and low-rank matrix optimization problems, such as robust principal component analysis (PCA) [12, 13] and sparse PCA [14], have also been studied. The tensor is a natural extension of the matrix to higher-dimensional space. Traditional matrix-based data analysis is inherently two-dimensional, which limits its ability to extract information from a multi-dimensional perspective.