Dempster (1958, 1960) proposed a non-exact test for the two-sample significance problem when the dimension of the data exceeds the degrees of freedom. He thereby raised the question of what statisticians should do when traditional multivariate statistical theory does not apply because the data dimension is too large. Later, Bai and Saranadasa (1996) found that even when traditional approaches can be applied, they are much less powerful than the non-exact test when the data dimension is large. This raised a second question: how can classical multivariate statistical procedures be adapted and improved when the data dimension is large? These problems have attracted considerable attention since the mid-2000s. Efforts to solve them have followed two directions. The first is to propose special statistical procedures for specific large-dimensional hypotheses where traditional multivariate procedures are inapplicable or perform poorly. The family of non-exact tests follows this approach. The second direction, following the work of Bai et al. (2009a), is to make systematic corrections to the classical multivariate statistical procedures so that the effect of large dimension is overcome. This goal is achieved by employing new and powerful asymptotic tools borrowed from the theory of random matrices, such as the central limit theorems in Bai and Silverstein (2004) and Zheng (2012).
Recently, research along these two directions has become very active in response to the growing need to analyze massive and large-dimensional data. Indeed, such “big data” are now routinely collected owing to rapid advances in computer-based and web-based commerce and in data-collection technology.
To meet this need, this monograph collects existing results along the second of these directions in large-dimensional data analysis. Chapters 2 and 3 present in detail the core fundamental results from random matrix theory on sample covariance matrices and random Fisher matrices. Chapters 4–12 collect large-dimensional statistical problems in which the classical large sample methods fail and the new asymptotic methods, based on the fundamental results of the preceding chapters, provide a valuable remedy.