Every day, quintillions of bytes of data are generated. This rate keeps increasing, far outpacing the rate at which we can upgrade our computing systems. In fact, a 2011 study by IDC estimated that the amount of data available in the world doubles every two years, a rate of growth that closely follows Moore's Law. If we cannot fully understand the issues involved and invent new data processing methods, society will soon be flooded with data: big data. Big data sets can exist within a single entity, but most such sets are distributed and can only be aggregated through some type of network. A bigger challenge is that the dynamics of big data and the dynamics of the underlying network are almost always highly correlated. Therefore, understanding the interplay between big data and the associated networks is a critical step in our effort to tackle big data. To date, however, we do not have a systematic theory with which to study the problem thoroughly. Even worse, in some cases we do not even know how to formulate and approach these problems. There is a need for a comprehensive book that surveys both the critical mathematical tools and the state of the art in related research fields.
This book focuses on large-scale information processing over networks, where the term information processing can refer to data processing, data storage, or information retrieval, and the term networks can refer to cyber networks, social networks, or biological networks. We take three complementary angles to study the interaction between the data and the underlying network connections. First, we address ways that the underlying network can constrain the upper-layer collaborative big data processing; second, we show how certain big data processing perspectives can help boost performance in various networks; third, we address the fundamental limits that govern statistical and computational bottlenecks in the analysis of big data. The book consists of chapters contributed by experts from diverse fields spanning machine learning, optimization, statistics, signal processing, networking, communications, sociology, and biology. The core unifying theme of the book is the rigorous mathematical treatment of its subjects, enriched by in-depth discussions of future directions at the end of each chapter.