In this chapter, we look at the moments of a random variable. Specifically, we demonstrate that moments capture useful information about the tail of a random variable while often being simpler to compute, or at least to bound. Several well-known inequalities quantify this intuition. Although they are straightforward to derive, such inequalities are surprisingly powerful. Through a range of applications, we illustrate the utility of controlling the tail of a random variable, typically by allowing one to dismiss certain “bad events” as rare. We begin by recalling the classical Markov and Chebyshev inequalities. Then we discuss three of the most fundamental tools in discrete probability and probabilistic combinatorics. First, we derive the complementary first and second moment methods, and give several standard applications, especially to threshold phenomena in random graphs and percolation. Then we develop the Chernoff–Cramér method, which relies on the “exponential moment” and is the building block for large deviations bounds. Two key applications in data science are briefly introduced: sparse recovery and empirical risk minimization.
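For orientation, the basic tail bounds underlying this chapter can be stated as follows (standard formulations; the book's notation may differ). For a non-negative random variable X and any a > 0, Markov's inequality gives
\[
\mathbb{P}[X \geq a] \;\leq\; \frac{\mathbb{E}[X]}{a},
\]
and applying it to (Y - \mathbb{E}[Y])^2 yields Chebyshev's inequality,
\[
\mathbb{P}\big[|Y - \mathbb{E}[Y]| \geq a\big] \;\leq\; \frac{\mathrm{Var}[Y]}{a^2}.
\]
The Chernoff–Cramér method instead applies Markov's inequality to the exponential moment and optimizes over the free parameter:
\[
\mathbb{P}[Y \geq a] \;\leq\; \inf_{\lambda > 0} e^{-\lambda a}\,\mathbb{E}\big[e^{\lambda Y}\big].
\]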
In this chapter, we move on to coupling, another probabilistic technique with a wide range of applications (far beyond discrete stochastic processes). The idea behind the coupling method is deceptively simple: to compare two probability measures, it is sometimes useful to construct a joint probability space with the corresponding marginals. We begin by defining coupling formally and deriving its connection to the total variation distance through the coupling inequality. We illustrate the basic idea on a classical Poisson approximation result, which we apply to the degree sequence of an Erdős–Rényi graph. Then we introduce the concept of stochastic domination and some related correlation inequalities. We develop a key application in percolation theory. Coupling of Markov chains is the next topic, where it serves as a powerful tool to derive mixing time bounds. Finally, we end with the Chen–Stein method for Poisson approximation, a technique that applies in particular in some natural settings with dependent variables.
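As a point of reference, the coupling inequality mentioned above is commonly stated as follows (a standard formulation): for any coupling (X, Y) of two probability measures \mu and \nu, that is, any pair of random variables defined on a common probability space with marginals \mu and \nu respectively,
\[
\|\mu - \nu\|_{\mathrm{TV}} \;=\; \sup_{A}\, |\mu(A) - \nu(A)| \;\leq\; \mathbb{P}[X \neq Y],
\]
and there exists an optimal coupling achieving equality.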
In this chapter, we develop spectral techniques. We highlight some applications to Markov chain mixing and network analysis. The main tools are the spectral theorem and the variational characterization of eigenvalues, which we review together with some related results. We also give a brief introduction to spectral graph theory and detail an application to community recovery. Then we apply the spectral theorem to reversible Markov chains. In particular we define the spectral gap and establish its close relationship to the mixing time. We also show that the spectral gap can be bounded using certain isoperimetric properties of the underlying network. We prove Cheeger’s inequality, which quantifies this relationship, and introduce expander graphs, an important family of graphs with good “expansion.” Applications to mixing times are also discussed. One specific technique is the “canonical paths method,” which bounds the spectral gap by formalizing a notion of congestion in the network.
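For orientation, one standard formulation of these relationships (constants and conventions vary across references) is the following. For a reversible Markov chain with spectral gap \gamma and conductance (or bottleneck ratio) \Phi_*, Cheeger's inequality states that
\[
\frac{\Phi_*^2}{2} \;\leq\; \gamma \;\leq\; 2\,\Phi_*,
\]
while the mixing time satisfies
\[
t_{\mathrm{mix}}(\varepsilon) \;\leq\; \frac{1}{\gamma_*}\,\log\!\Big(\frac{1}{\varepsilon\,\pi_{\min}}\Big),
\]
where \gamma_* is the absolute spectral gap and \pi_{\min} is the smallest stationary probability.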
In this chapter, we describe a few discrete probability models to which we will come back repeatedly throughout the book. While there exists a vast array of well-studied random combinatorial structures (permutations, partitions, urn models, Boolean functions, polytopes, etc.), our focus is primarily on a limited number of graph-based processes, namely percolation, random graphs, Ising models, and random walks on networks. We will not attempt to derive the theory of these models exhaustively here. Instead we will employ them to illustrate some essential techniques from discrete probability. Note that the toolkit developed in this book is meant to apply to other probabilistic models of interest as well, and in fact many more will be encountered along the way. After a brief review of graph basics and Markov chain theory, we formally introduce our main models. We also formulate various key questions about these models that will be answered (at least partially) later on. We assume that the reader is familiar with the measure-theoretic foundations of probability. A refresher of all required concepts and results is provided in the appendix.
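For concreteness, two of the models mentioned above are commonly defined as follows (standard definitions; the notation here is not necessarily the book's). The Erdős–Rényi random graph G(n, p) is a graph on n labeled vertices in which each of the \binom{n}{2} possible edges is present independently with probability p, so that for any fixed graph g on those vertices
\[
\mathbb{P}[G(n,p) = g] \;=\; p^{\,|E(g)|}\,(1-p)^{\binom{n}{2} - |E(g)|}.
\]
Similarly, in bond percolation on a graph G = (V, E) with parameter p, each edge of E is independently declared open with probability p and closed otherwise, and one studies the connected components of the open subgraph.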
Branching processes, which are the focus of this chapter, arise naturally in the study of stochastic processes on trees and locally tree-like graphs. Similarly to martingales, finding a hidden branching process within a probabilistic model can lead to useful bounds and insights into asymptotic behavior. After a review of the extinction theory of branching processes and of a fruitful random-walk perspective, we give a couple of examples of applications in discrete probability. In particular we analyze the height of a binary search tree, a standard data structure in computer science. We also give an introduction to phylogenetics, where a “multitype” variant of the Galton–Watson branching process plays an important role; we use the techniques derived in this chapter to establish a phase transition in the reconstruction of ancestral molecular sequences. We end this chapter with a detailed look into the phase transition of the Erdős–Rényi graph model. The random-walk perspective mentioned above allows one to analyze the “exploration” of a largest connected component, leading to information about the “evolution” of its size as edge density increases.
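For reference, the classical extinction criterion takes the following standard form. If a Galton–Watson branching process has offspring distribution with mean m and probability generating function f(s) = \mathbb{E}[s^{X}], then the extinction probability q is the smallest fixed point of f on [0, 1],
\[
q \;=\; \min\{\, s \in [0,1] \,:\, f(s) = s \,\},
\]
and q = 1 (i.e., extinction is certain) if and only if m \leq 1, excluding the degenerate case where each individual has exactly one offspring almost surely.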
In this chapter, we turn to martingales, which play a central role in probability theory. We illustrate their use in a number of applications to the analysis of discrete stochastic processes. After some background on stopping times and a brief review of basic martingale properties and results, we develop two major directions. First, we show how martingales can be used to derive a substantial generalization of our previous concentration inequalities – from the sums of independent random variables we focused on previously to nonlinear functions with Lipschitz properties. In particular, we give several applications of the method of bounded differences to random graphs. We also discuss bandit problems in machine learning. Second, we give an introduction to potential theory and electrical network theory for Markov chains. This toolkit provides, in particular, bounds on hitting times for random walks on networks, with important implications in the study of recurrence among other applications. We also introduce Wilson’s remarkable method for generating uniform spanning trees.
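As a point of reference, the concentration result behind the method of bounded differences is commonly stated as follows (McDiarmid's inequality, in one standard form). If X_1, \dots, X_n are independent and f satisfies the bounded differences property with constants c_1, \dots, c_n, meaning that changing the i-th coordinate alone changes the value of f by at most c_i, then for all t > 0,
\[
\mathbb{P}\big[\,|f(X_1,\dots,X_n) - \mathbb{E}[f(X_1,\dots,X_n)]| \geq t\,\big]
\;\leq\;
2\exp\!\Big(-\frac{2t^2}{\sum_{i=1}^n c_i^2}\Big).
\]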
Providing a graduate-level introduction to discrete probability and its applications, this book develops a toolkit of essential techniques for analysing stochastic processes on graphs, other random discrete structures, and algorithms. Topics covered include the first and second moment methods, concentration inequalities, coupling and stochastic domination, martingales and potential theory, spectral methods, and branching processes. Each chapter expands on a fundamental technique, outlining common uses and showing them in action on simple examples and more substantial classical results. The focus is predominantly on non-asymptotic methods and results. All chapters provide a detailed background review section, plus exercises and signposts to the wider literature. Readers are assumed to have undergraduate-level linear algebra and basic real analysis, while prior exposure to graduate-level probability is recommended. This much-needed broad overview of discrete probability could serve as a textbook or as a reference for researchers in mathematics, statistics, data science, computer science and engineering.