We cannot overstate the importance of Shannon's contribution to modern science. His introduction of the field of information theory and his proofs of its two main theorems demonstrate that his ideas on communication were far ahead of the other prevailing ideas in this domain around 1948.
In this chapter, our aim is to discuss Shannon's two main contributions in a descriptive fashion. The goal of this high-level discussion is to build up the intuition for the problem domain of information theory and to understand the main concepts before we delve into the analogous quantum information-theoretic ideas. We avoid going into deep technical detail in this chapter, leaving such details for later chapters where we formally prove both classical and quantum Shannon-theoretic coding theorems. We do use some mathematics from probability theory (namely, the law of large numbers).
We will be delving into the technical details of this chapter's material in later chapters (specifically, Chapters 10, 13, and 14). Once you have reached later chapters that develop some more technical details, it might be helpful to turn back to this chapter to get an overall flavor for the motivation of the development.
Data Compression
We first discuss the problem of data compression. Those who are familiar with the Internet have used several popular data formats such as JPEG, MPEG, ZIP, and GIF. All of these file formats have corresponding algorithms for compressing the output of an information source. A first glance at the compression problem might lead one to believe that it is not possible to compress the output of the information source to an arbitrarily small size and still recover it faithfully, and Shannon proved that this is the case. This result is the content of Shannon's first theorem, the noiseless coding theorem.
An Example of Data Compression
We begin with a simple example that illustrates the concept of an information source. We then develop a scheme for coding this source so that it requires fewer bits to represent its output faithfully.
Suppose that Alice is a sender and Bob is a receiver. Suppose further that a noiseless bit channel connects Alice to Bob. A noiseless bit channel is one that transmits information perfectly from sender to receiver: Bob receives “0” if Alice transmits “0,” and Bob receives “1” if Alice transmits “1.”
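As a minimal sketch of this setup, we can model the noiseless bit channel as a function that returns its input unchanged. The names noiseless_bit_channel and transmit, and the sample message, are hypothetical and chosen only for illustration; they do not come from the text.

```python
def noiseless_bit_channel(bit: int) -> int:
    """Hypothetical model of a noiseless bit channel: the output equals the input."""
    if bit not in (0, 1):
        raise ValueError("a bit channel accepts only 0 or 1")
    return bit


def transmit(bits):
    """Send each bit through the channel, one channel use per bit."""
    return [noiseless_bit_channel(b) for b in bits]


# Illustrative message (an assumption, not taken from the text).
alice_message = [0, 1, 1, 0, 1]
bob_received = transmit(alice_message)
```

Because the channel introduces no errors, Bob's received sequence always matches Alice's transmitted sequence, and the only cost of communication is the number of channel uses, which here equals the number of bits sent.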