1. Introduction, organization, and setup
In the last twenty years there has been growing interest in complex networks and inhomogeneous particle systems. The classical mean-field framework (see e.g. [23], [26]), in which all the particles are connected to each other, has been extended to include interactions described by general networks. In these more general models, the interaction between two particles depends on the weight of the edge connecting the two in an underlying network. Applications of these models include mean-field games [9, 10], synchronization phenomena [2], neuroscience [25], and statistical mechanics [27], among others.
The first mathematically rigorous results on McKean–Vlasov particle systems on graphs appeared only recently [7, 16]: these consider certain graph sequences with diverging average degree. Under suitable homogeneity conditions on the degrees, the system is described by the classical mean-field limit [26] as the number of particles tends to infinity. The cited works leave several relevant questions unanswered: Is it possible to characterize the graph sequences for which the system converges to the mean-field limit? How sensitive are the system dynamics to the degree inhomogeneity in the underlying graph? The existing literature (see e.g. [4], [15], [20]) covers a large class of particle systems and different graph regimes, i.e. from dense to almost sparse graphs, yet none of these results fully includes the well-known class of exchangeable random graphs [17].
Our aim is to show that a law of large numbers for weakly interacting particle systems holds for any dense random graph sequence, including exchangeable random graphs. To keep the focus on the random graph limit viewpoint, we restrict the class of interacting particles to weakly interacting oscillators, a class of systems that shows a remarkably rich behavior [2, 25]. We note that, at the cost of more technicalities, our result could be extended to more general particle systems. The main result of this work is a law of large numbers for the empirical measure under the sole assumption that the graph sequence converges in probability in the space of graphons; see [19] for a comprehensive account of graphons. Namely, we tackle the challenging case of a random graph limit, which includes pseudo-random graphs (see e.g. [3], [12]) and exchangeable random graphs (see e.g. [17]). To the authors' knowledge, this is the first result in the literature that explicitly links random graph limits to empirical measures.
As a byproduct, we are able to identify deterministic and random graph sequences for which the particle system behavior is approximately mean-field. We also provide an example of a particle system undergoing a phase transition, where the critical threshold depends on the randomness of the graph limit; see Proposition 3.2 and the example below.
1.1. Unlabeled graphons and empirical measures
To better illustrate our contribution, a clarification of the term graphon is in order. In the literature on McKean–Vlasov particle systems, the term graphon is usually employed for what classical graph limits theory (we refer to [17], [19]) calls a labeled graphon, i.e. a symmetric, bounded, and measurable function $W\colon [0,1] \times [0,1] \to \mathbb{R}$. The space [0,1] is known as the label space and is used as a particle index: roughly speaking, every $x \in [0,1]$ corresponds to a particle, and the interaction strength between two particles x and y is given by the value of W(x, y).
The mathematical space of graphons is not merely given by labeled graphons but constructed via certain equivalence classes: two labeled graphons W and V are identified whenever

$V = W^\varphi \quad \text{for some measure-preserving map } \varphi \colon [0,1] \to [0,1],$

where $W^\varphi (x, y) = W (\varphi(x), \varphi(y))$ for every x and y in [0,1]. The measure-preserving map $\varphi$ corresponds to a label permutation in a finite graph (note that $\varphi$ need not be invertible in the infinite setting; see [19, Proposition 7.10]). The equivalence class of W is denoted by $\widetilde{W}$: it contains the connection information of the labeled graphon W, while being independent of its labeling. Following [17], we refer to this object as an unlabeled graphon. The space of graphons is the space of unlabeled graphons endowed with a suitable distance; see Section 2.3 for further details.
The notion of labeling is relevant when studying interacting particle systems. On the one hand, we are interested in understanding the correlation between two particles and need their labels to be fixed to suitably describe it. On the other hand, we want to describe the macroscopic behavior, e.g. by studying the empirical measure, and there the labels do not play any role. Hence, depending on the question we are interested in, the suitable modeling choice might be labeled or unlabeled graphs. From a graphon viewpoint, considering unlabeled graphons is not a technical recourse but in fact represents the cornerstone of the theory and the key ingredient to many of its interesting properties. As will become clear later, when describing the dynamics of a particle population, the concept of labeled graphon is not satisfactory: different labeled graphons can lead to exactly the same dynamics!
Our contribution stems from the previous considerations. We refer to Section 3.3 for a comparison with the existing literature.
1.2. Organization
We now present the setup and notation used, as well as the various distances between probability measures that will be considered below. In Section 2 we give a brief introduction to graphons and the main tools needed for this work.
In Section 3 we define the interacting particle system and the associated nonlinear process. Existence, uniqueness, and stability results for the nonlinear process are presented immediately afterwards; see Theorem 3.1. Our main result, Theorem 3.2, is given in Section 3.2. Exchangeable random graphs are then discussed together with a propagation of chaos result; see Section 3.4. Section 3.5 is devoted to the comparison with the classical mean-field behavior and to a few important consequences of Theorem 3.2; the discussion is supported by two explanatory examples.
In Section 4 we focus on the nonlinear process. In particular, we discuss its relationship with other characterizations already known in the literature. The proofs of the main results for the nonlinear process are given in Section 4.4.
Section 5 contains the proof of Theorem 3.2. Finally, in Appendix A we give a useful characterization of convergence in probability for random graph sequences.
2. Setting and notations
We consider particle dynamics occurring on a finite time interval, say [0, T], which we fix once and for all. We work on the filtered probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t \in [0,T]}, P)$, where $\{\mathcal{F}_t\}_{t \in [0,T]}$ is a filtration satisfying the usual conditions.
We use two different notations for expressing conditional probabilities: the one referring to the Brownian motions and initial conditions is denoted by $\mathbf{P}$, with expectation $\mathbf{E}$; the one referring to the randomness in the graph sequences, and/or in their limit object, is denoted by $\mathbb{P}$, with expectation $\mathbb{E}$.
The interval $I := [0,1]$ represents the space of (continuous) labels. We study functions with values in the one-dimensional torus $\mathbb{T} := \mathbb{R}/(2\pi\mathbb{Z})$. The space of continuous functions from [0, T] to $\mathbb{T}$ is denoted by $\mathcal{C}([0,T], \mathbb{T})$ and is endowed with the supremum norm. For $\alpha > 0$, the space of $\alpha$-Hölder-continuous functions from $\mathbb{T}^2$ to $\mathbb{R}$ is denoted by $\mathcal{C}^{\alpha}(\mathbb{T}^2)$. Given a separable metric space E, the space of probability measures on E is denoted by $\mathcal{P}(E)$.
The various constants throughout the paper are always denoted by C or $C'$ and may vary from line to line. Explicit dependence on a parameter $\alpha$ is denoted by $C_\alpha$.
2.1. Distance between probability measures
For two probability measures $\bar{\mu},\, \bar{\nu} \in \mathcal{P}(\mathcal{C}([0,T], \mathbb{T}))$, we define their distance by

$d_T(\bar{\mu}, \bar{\nu}) := \inf_{m \in \gamma(\bar{\mu}, \bar{\nu})} \Big\{ \int \sup_{s \in [0,T]} |\theta_s - \theta'_s|^2 \, m(\mathrm{d}\theta, \mathrm{d}\theta') \Big\}^{1/2}, \quad (2.1)$

where $\gamma(\bar{\mu},\bar{\nu})$ is the set of probability measures on $\mathcal{C}([0,T], \mathbb{T}) \times \mathcal{C}([0,T], \mathbb{T})$ with first marginal equal to $\bar{\mu}$ and second marginal equal to $\bar{\nu}$. This definition coincides with the 2-Wasserstein distance between probability measures. The right-hand side of (2.1) can be rewritten as

$d_T(\bar{\mu}, \bar{\nu}) = \inf \Big\{ \mathbf{E}\big[ \sup_{s \in [0,T]} |X_s - Y_s|^2 \big]^{1/2} \Big\}, \quad (2.2)$

where the infimum is taken over all random variables X and Y with values in $\mathcal{C}([0,T], \mathbb{T})$ and laws $\mathcal{L}(X) = \bar{\mu}$ and $\mathcal{L}(Y) = \bar{\nu}$ respectively. From (2.1) we obtain that for every $s \in [0,T]$

$\sup_{f} \Big| \int_{\mathbb{T}} f \, \mathrm{d}\bar{\mu}_s - \int_{\mathbb{T}} f \, \mathrm{d}\bar{\nu}_s \Big| \leq d_T(\bar{\mu}, \bar{\nu}), \quad (2.3)$

where the supremum is taken over all 1-Lipschitz functions f from $\mathbb{T}$ to $\mathbb{R}$, and $\bar{\mu}_s, \bar{\nu}_s$ denote the marginals at time s. Observe that these definitions also make sense with $T=0$ and $\mathcal{C}([0,T], \mathbb{T})$ replaced by $\mathbb{T}$.
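The inequalities above admit a quick numerical sanity check. The sketch below is illustrative and not part of the paper's argument: it works on $\mathbb{R}$ rather than on the torus, with empirical measures of two samples in place of $\bar{\mu}, \bar{\nu}$, and uses the fact that in one dimension the sorted (comonotone) coupling is optimal for every Wasserstein distance.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.sort(rng.normal(0.0, 1.0, 2000))   # sample from a first measure
y = np.sort(rng.normal(0.5, 1.2, 2000))   # sample from a second measure

# Sorted coupling is optimal in 1D, so these are the empirical W_1 and W_2:
w1 = np.mean(np.abs(x - y))
w2 = np.sqrt(np.mean((x - y) ** 2))

# A 1-Lipschitz test function, e.g. f(t) = |t|:
gap = abs(np.abs(x).mean() - np.abs(y).mean())

assert gap <= w1 + 1e-12   # Kantorovich duality: sup over 1-Lipschitz f is W_1
assert w1 <= w2 + 1e-12    # Jensen's inequality: W_1 <= W_2, as used in (2.3)
```

The second assertion is the step behind (2.3): the Lipschitz (W_1-type) quantity on the left is dominated by the 2-Wasserstein distance.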
2.2. Distance between finite graphs
We write $[n] := \{1,\ldots,n\}$ for $n \in \mathbb{N}$. Let $\xi$ be a graph on n vertices. With an abuse of notation, we let $\xi$ denote its adjacency matrix as well, i.e. $\xi = \{\xi_{ij}\}_{i,j \in [n]}$. We consider simple undirected graphs, so that $\xi_{ij} = \xi_{ji}$ and $\xi_{ii} = 0$ for all $1 \leq i \leq j \leq n$.
Let $A=\{A_{ij}\}_{i,j\in[n]}$ be an $n\times n$ real matrix. The cut-norm of A is a well-known norm for (not necessarily square) finite matrices (see e.g. [1], [19]); it is defined as

$\lVert A \rVert_\square := \frac{1}{n^2} \max_{S, T \subseteq [n]} \Big| \sum_{i \in S,\, j \in T} A_{ij} \Big|. \quad (2.4)$

It is well known that this norm is equivalent to the $\ell_\infty \to \ell_1$ norm [1],

$\lVert A \rVert_{\infty \to 1} := \max_{s_i, t_j \in \{-1, +1\}} \sum_{i,j \in [n]} A_{ij} s_i t_j,$

in the sense that $\lVert A \rVert_\square \leq \frac{1}{n^2} \lVert A \rVert_{\infty \to 1} \leq 4 \lVert A \rVert_\square$. For two graphs $\xi$ and $\xi'$ on the same set of vertices, we define the distance $d_\square$ as

$d_\square(\xi, \xi') := \lVert \xi - \xi' \rVert_\square.$
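For tiny matrices both norms can be computed by brute force over all subsets and sign vectors. The following sketch (illustrative only; the enumeration is exponential in n) checks the norm equivalence numerically.

```python
import itertools
import numpy as np

def cut_norm(A):
    """Brute-force cut-norm: (1/n^2) max_{S,T} |sum_{i in S, j in T} A_ij|."""
    n = A.shape[0]
    best = 0.0
    for S in itertools.product([0, 1], repeat=n):      # indicator of S
        for T in itertools.product([0, 1], repeat=n):  # indicator of T
            best = max(best, abs(np.array(S) @ A @ np.array(T)))
    return best / n**2

def inf_to_one_norm(A):
    """Brute-force l_infty -> l_1 norm: max over sign vectors s, t of s^T A t."""
    n = A.shape[0]
    best = 0.0
    for s in itertools.product([-1, 1], repeat=n):
        for t in itertools.product([-1, 1], repeat=n):
            best = max(best, np.array(s) @ A @ np.array(t))
    return best

rng = np.random.default_rng(1)
A = rng.uniform(-1, 1, size=(4, 4))
A = (A + A.T) / 2            # symmetric, like a difference of adjacency matrices
c, o = cut_norm(A), inf_to_one_norm(A)

# norm equivalence ||A||_cut <= (1/n^2) ||A||_{inf->1} <= 4 ||A||_cut
assert c <= o / 16 + 1e-9
assert o / 16 <= 4 * c + 1e-9
```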
2.3. Labeled and unlabeled graphons
The following definitions come from the book by Lovász [19], whose notation we adopt. Recall that $I=[0,1]$ and let

$\mathcal{W} := \{ W \colon I^2 \to \mathbb{R} \mid W \text{ symmetric, bounded, and measurable} \}$

be the space of kernels; we tacitly consider two kernels to be equal if and only if the subset of $I^2$ where they differ has zero Lebesgue measure. A graphon is a kernel W such that $0\leq W \leq 1$. Let $\mathcal{W}_0$ denote the space of graphons. The cut-norm of $W \in \mathcal{W}$ is defined as

$\lVert W \rVert_\square := \max_{S, T \subseteq I} \Big| \int_{S \times T} W(x,y) \, \mathrm{d}x \, \mathrm{d}y \Big|, \quad (2.5)$

where the maximum is taken over all measurable subsets S and T of I. It is well known that $\lVert W\rVert_\square$ is equivalent to the norm of W seen as an operator from $L^\infty(I)$ to $L^1(I)$ [19, Theorem 8.11]. This is defined as

$\lVert W \rVert_{\infty \to 1} := \sup_{\lVert g \rVert_\infty \leq 1} \lVert W g \rVert_1,$

where $(Wg)(x) := \int_I W(x,y)g(y) \,\mathrm{d}y$ for $x \in I$ and $g \in L^\infty (I)$.
The metric induced by $\lVert\cdot\rVert_\square$, or equivalently by $\lVert\cdot\rVert_{\infty \to 1}$, on the space of graphons $\mathcal{W}_0$ is again denoted by $d_{\square}(\cdot,\cdot)$. Definitions (2.4) and (2.5) are consistent in the sense that to each graph $\xi$ is associated a graphon $W_\xi \in \mathcal{W}_0$ such that $\lVert\xi\rVert_\square = \lVert W_\xi\rVert_\square$. The graphon $W_\xi$ is usually defined almost everywhere as

$W_\xi(x, y) := \xi_{ij} \quad \text{for } (x,y) \in \big[\tfrac{i-1}{n}, \tfrac{i}{n}\big) \times \big[\tfrac{j-1}{n}, \tfrac{j}{n}\big), \quad i, j \in [n]. \quad (2.6)$

Note that $W_\xi$ depends on the labeling of $\xi$. Indeed, different labelings of $\xi$ yield graphs that in general are far apart in $d_\square$. This motivates the definition of the so-called cut-distance. For two graphs $\xi$, $\xi'$ with the same number of nodes, the cut-distance is defined as

$\hat{\delta}_\square(\xi, \xi') := \min_{\tilde{\xi}'} d_\square(\xi, \tilde{\xi}'),$
where the minimum ranges over all labelings of $\xi'$. The cut-distance is also defined for graphons as follows. For two graphons $W,V \in \mathcal{W}_0$, their cut-distance is

$\delta_\square(W, V) := \min_{\varphi \in S_I} \lVert W - V^\varphi \rVert_\square,$

where the minimum ranges over $S_I$, the space of invertible measure-preserving maps from I into itself, and where $V^\varphi (x,y) := V(\varphi(x),\varphi(y))$ for $x,y \in I$.
Remark 2.1. There are at least two ways to compare the graphs $\xi, \xi'$ as unlabeled objects: either by directly computing their distance $\hat{\delta}_\square$ or by computing the distance $\delta_\square$ between $W_\xi$ and $W_{\xi'}$. These turn out to be equivalent as the number of vertices tends to infinity [19, Theorem 9.29]. Formally, for every two graphs $\xi, \xi'$ on n vertices, it holds that

$\delta_\square(W_\xi, W_{\xi'}) \leq \hat{\delta}_\square(\xi, \xi') \leq \delta_\square(W_\xi, W_{\xi'}) + \frac{C}{\sqrt{\log n}}$

for a universal constant $C > 0$. We always write $\delta_\square (\xi, \xi') := \delta_{\square} (W_\xi, W_{\xi'})$.
Contrary to $d_\square$, the cut-distance $\delta_\square$ is only a pseudometric on $\mathcal{W}_0$, since the distance between two different graphons can be zero. This leads to the definition of the unlabeled graphon $\widetilde{W}$ associated to W. For a graphon W, $\widetilde{W}$ is defined as the equivalence class of W, consisting of all $V\in \mathcal{W}_0$ such that $\delta_\square(W,V)=0$. For notation's sake, we drop both the tilde and the adjective unlabeled when the context is clear.
The quotient space obtained in this way is denoted by $\widetilde{\mathcal{W}}_0$ and we refer to it as the space of unlabeled graphons. A celebrated result of graph limits theory is that $(\widetilde{\mathcal{W}}_0, \delta_\square)$ is a compact metric space [19, Theorem 9.23].
We will not go into the details of graph convergence, for which we refer to the exhaustive source [19]. We only recall that a sequence of graphs $\{\xi^{(n)}\}_{n \in \mathbb{N}}$ converges to the graphon $W \in \widetilde{\mathcal{W}}_0$ if and only if $\delta_\square(W_{\xi^{(n)}},W)\to 0$ as $n\to\infty$ [19, Theorem 11.22].
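For very small graphs the objects of this section can be computed by brute force. The following sketch (illustrative only; the enumeration over labelings is factorial in the number of vertices) contrasts $d_\square$, which depends on labelings, with $\hat{\delta}_\square$, which does not.

```python
import itertools
import numpy as np

def d_square(a, b):
    """Labeled cut-distance of two 0/1 adjacency matrices:
    (1/n^2) max_{S,T} |sum_{i in S, j in T} (a - b)_ij| (brute force)."""
    n = a.shape[0]
    diff = a - b
    best = 0.0
    for S in itertools.product([0, 1], repeat=n):
        for T in itertools.product([0, 1], repeat=n):
            best = max(best, abs(np.array(S) @ diff @ np.array(T)))
    return best / n**2

def hat_delta_square(a, b):
    """Unlabeled cut-distance: minimize d_square over all relabelings of b."""
    n = a.shape[0]
    return min(d_square(a, b[np.ix_(p, p)])
               for p in itertools.permutations(range(n)))

# A 4-cycle and a relabeled copy: distinct as labeled graphs, identical as
# unlabeled ones.
c4 = np.array([[0,1,0,1],[1,0,1,0],[0,1,0,1],[1,0,1,0]])
perm = [0, 2, 1, 3]
c4_relab = c4[np.ix_(perm, perm)]

assert d_square(c4, c4_relab) > 0            # the labeling matters for d_square
assert hat_delta_square(c4, c4_relab) == 0   # but not for hat_delta_square
```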
3. The models and main results
3.1. The models
We introduce our two main models: a weakly interacting particle system (3.1) and a nonlinear process (3.3).
3.1.1. Weakly interacting oscillators on graphs.
Let $\{\xi^{(n)}\}_{n \in \mathbb{N}}$ be a sequence of undirected, labeled graphs. For $n \in \mathbb{N}$, the adjacency matrix of $\xi^{(n)}$ is given by the $n\times n$ symmetric matrix $\{\xi^{(n)}_{ij}\}_{i,j =1, \ldots, n}$, where $\xi^{(n)}_{ij}=1$ whenever the vertices i and j are connected and $\xi^{(n)}_{ij}=0$ otherwise. Let $\{\theta^{i,n} \}_{i=1,\ldots,n}$ be the family of oscillators on $\mathbb{T}^n$ satisfying

$\mathrm{d}\theta^{i,n}_t = F(\theta^{i,n}_t)\,\mathrm{d}t + \frac{1}{n} \sum_{j=1}^n \xi^{(n)}_{ij}\, \Gamma(\theta^{i,n}_t, \theta^{j,n}_t)\,\mathrm{d}t + \mathrm{d}B^i_t, \quad t \in [0,T],\ i \in [n], \quad (3.1)$

where F and $\Gamma$ are bounded, uniformly Lipschitz functions and $\{B^i\}_{i\in\mathbb{N}}$ is a sequence of independent and identically distributed (i.i.d.) Brownian motions on $\mathbb{T}$. The initial conditions $\{\theta_0^i \}_{i\in\mathbb{N}}$ are i.i.d. random variables sampled from some probability distribution $\bar{\mu}_0 \in \mathcal{P}(\mathbb{T})$, which is fixed once and for all.
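A system of this form is easy to simulate with an Euler–Maruyama scheme. The sketch below is only an illustration: the choices $F \equiv 0$ and $\Gamma(\theta, \psi) = K\sin(\psi - \theta)$ (the Kuramoto interaction) and the Erdős–Rényi graph are assumptions made here, not fixed by the text.

```python
import numpy as np

def simulate(xi, theta0, T=1.0, dt=1e-2, K=1.0, seed=0):
    """Euler-Maruyama scheme for
    dtheta_i = (K/n) sum_j xi_ij sin(theta_j - theta_i) dt + dB_i
    (illustrative Kuramoto choices F = 0, Gamma(a, b) = K sin(b - a))."""
    rng = np.random.default_rng(seed)
    n = len(theta0)
    theta = theta0.copy()
    for _ in range(int(round(T / dt))):
        # pairwise interaction weighted by the adjacency matrix xi
        drift = K / n * np.sum(xi * np.sin(theta[None, :] - theta[:, None]), axis=1)
        theta = (theta + drift * dt + np.sqrt(dt) * rng.normal(size=n)) % (2 * np.pi)
    return theta

n = 50
rng = np.random.default_rng(1)
xi = (rng.uniform(size=(n, n)) < 0.5).astype(float)   # Erdos-Renyi(1/2) sample
xi = np.triu(xi, 1)
xi = xi + xi.T                                        # simple undirected, no loops
theta_T = simulate(xi, rng.uniform(0, 2 * np.pi, n))

assert theta_T.shape == (n,)
assert np.all((0 <= theta_T) & (theta_T < 2 * np.pi))   # values stay on the torus
```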
Remark 3.1. At the cost of more technical details, one can relax the hypothesis on the initial conditions and require independent but not identically distributed $\{\theta^i_0\}_{i \in \mathbb{N}}$. Instead, we keep the proofs simple, following the classical propagation of chaos arguments; see e.g. [26].
Many interesting examples of interacting oscillators fit this framework, such as the Kuramoto model, the plane rotator model, and other generalizations; see e.g. [16, §1.2], [3], and also Section 3.5. The applications of interacting oscillators range from modeling synchronization phenomena (see e.g. [2]) to neuroscience (e.g. [25]), as well as the study of statistical mechanics properties of complex systems [27].
We are interested in studying the empirical measure associated to (3.1). This is defined as the (random) probability measure on $\mathbb{T}$ given by

$\mu^n_t := \frac{1}{n} \sum_{j=1}^n \delta_{\theta^{j,n}_t} \quad (3.2)$

for every $t\in[0,T]$. Alternatively, one can also consider the empirical measure $\mu^n$ of the trajectories, that is, on $\mathcal{C}([0,T], \mathbb{T})$,

$\mu^n := \frac{1}{n} \sum_{j=1}^n \delta_{\theta^{j,n}},$

where $\theta^{j,n} = (\theta^{j,n}_t)_{t\in[0,T]} \in \mathcal{C}([0,T], \mathbb{T})$ is the trajectory on [0, T] of particle j, for $j=1, \ldots, n$. The two are equally common in the literature; see e.g. [15, Remark 1.1] for the relation between them.
3.1.2. The nonlinear process.
Fix a graphon $W \in \widetilde{\mathcal{W}}_0$ and a uniform random variable U on I. Consider the solution $\theta = \{\theta_t\}_{t \in [0,T]}$ to the following system:

$\mathrm{d}\theta_t = F(\theta_t)\,\mathrm{d}t + \Big( \int_I W(U, y) \int_{\mathbb{T}} \Gamma(\theta_t, \psi)\, \mu^y_t(\mathrm{d}\psi)\, \mathrm{d}y \Big)\,\mathrm{d}t + \mathrm{d}B_t, \quad \mu^y_t := \mathcal{L}(\theta_t \mid U = y), \quad (3.3)$

where the initial condition $\theta_0$ has law $\mathcal{L}(\theta_0)=\bar{\mu}_0$ and is taken to be independent of U. The Brownian motion B is independent of the sequence $\{B^i\}_{i \in \mathbb{N}}$, of U, of the initial condition $\theta_0$, and also of the sequence $\{\xi^{(n)}\}_{n\in\mathbb{N}}$.
The next theorem, on existence and uniqueness of the solution to equation (3.3), is proved in Section 4, together with well-posedness with respect to W. For the latter, see Remark 4.4.
Theorem 3.1. Suppose that F and $\Gamma$ are bounded, uniformly Lipschitz functions. For every uniform random variable U on I, there exists a unique pathwise solution to (3.3).
If $\bar{\mu} \in \mathcal{C}([0,T], \mathcal{P}(\mathbb{T}))$ denotes the law of the solution $\theta = \{\theta_t\}_{t \in [0,T]}$ and $\mu^x$ the law of $\theta$ conditioned on $U=x$, then $\bar{\mu}$ solves the following nonlinear Fokker–Planck equation in the weak sense:

$\partial_t \mu^x_t = \frac{1}{2}\, \partial^2_\theta \mu^x_t - \partial_\theta \Big[ \mu^x_t \Big( F(\theta) + \int_I W(x,y) \int_{\mathbb{T}} \Gamma(\theta, \psi)\, \mu^y_t(\mathrm{d}\psi)\, \mathrm{d}y \Big) \Big], \quad \bar{\mu}_t = \int_I \mu^x_t \, \mathrm{d}x, \quad (3.4)$

with initial condition $\bar{\mu}_0 \in \mathcal{P}(\mathbb{T})$.
3.2. Main result
We are now able to present our main result. Afterwards, we present an application to exchangeable random graphs and a propagation of chaos result.
Theorem 3.2. Assume the hypothesis of Theorem 3.1 and further suppose that $\Gamma \in \mathcal{C}^{1+\varepsilon}(\mathbb{T}^2)$ for some $\varepsilon>0$. Let $\{\xi^{(n)}\}_{n \in \mathbb{N}}$ be a sequence of random graphs. Assume that there exists a random variable W in $\widetilde{\mathcal{W}}_0$ to which $\xi^{(n)}$ converges in $\mathbb{P}$-probability, or equivalently such that, for every $\varepsilon' > 0$,

$\lim_{n \to \infty} \mathbb{P}\big( \delta_\square(W_{\xi^{(n)}}, W) > \varepsilon' \big) = 0. \quad (3.5)$

Suppose that the initial conditions $\{\theta^i_0\}_{i \in \mathbb{N}}$ are i.i.d. random variables with law $\bar{\mu}_0$ and are independent of $\{\xi^{(n)}\}_{n \in \mathbb{N}}$. Then

$\mu^n \longrightarrow \bar{\mu} \quad \text{in } \mathbf{P}\times\mathbb{P}\text{-probability as } n \to \infty, \quad (3.6)$

where the convergence is in $\mathcal{P}(\mathcal{C}([0,T], \mathbb{T}))$ and $\bar{\mu}$ is a random variable depending only on the randomness of W; that is, for almost every $\omega \in \Omega$, $\bar{\mu}(\omega)$ solves equation (3.4) starting from $\bar{\mu}_0$, with graphon $W(\omega)$.
The hypothesis (3.5) on graph convergence is very general. Indeed, it holds for any $\xi^{(n)}$ in $\mathcal{W}$, meaning that $\xi^{(n)}$ can take values in $\mathbb{R}$ rather than in $\{0,1\}$; recall the definition of kernels in Section 2.3. Moreover, the sequence $\xi^{(n)}$ can be deterministic or random; if it is random, we also cover convergence in probability to a random graphon limit W. In this last case, well-posedness and measurability of equation (3.4) are granted by Theorem 4.1.
Looking at the proof of Theorem 3.2, we remark that if the limiting graphon W is deterministic, the initial conditions $\{\theta^i_0\}_{i \in \mathbb{N}}$ can depend on the graph sequence $\{\xi^{(n)}\}_{n\in \mathbb{N}}$. In other words, Theorem 3.2 also holds if $\{\theta^i_0\}_{i \in \mathbb{N}}$ is independent of the randomness in W but not necessarily of the whole sequence $\{\xi^{(n)}\}_{n\in \mathbb{N}}$. The relationship between the randomness left in W and the randomness in $\xi^{(n)}$ is further discussed in Section 3.5.
Requiring that $\Gamma \in {{{\mathcal C}}} ^{1+\varepsilon}({{{\mathbb T}}} ^2)$ for $\varepsilon>0$ has no specific physical meaning. It is a technical assumption to ensure the convergence of estimates involving the Fourier coefficients of $\Gamma$ ; we refer to the proof of Theorem 3.2 in Section 5.
3.3. Comparison with the existing literature
Weakly interacting particle systems on graph sequences converging to graphons have already been considered in a series of works, both in the stochastic setting [4, 20, 24] and in the deterministic one [11, 22]. Most of these results vary in the class of particle systems considered, the notion of convergence of the graph sequence, or the hypotheses on the initial conditions. To our knowledge, no result in the literature considers graph sequences arising from random graphons. In addition, we are not aware of any result directly using the cut-distance as we do.
We focus on dense graph sequences where, roughly speaking, the average degree in the graph is proportional to the number of vertices. All the cited results are established in greater generality with respect to the graph density, i.e. they consider random graph sequences in different regimes, from dense to almost sparse. There is no common agreement on the terminology for dense/sparse graphs; we refer to [13] for a presentation of the subject.
In the deterministic setting, Medvedev and coauthors (see [11], [22], and references therein) were the first to consider random graph sequences arising from a graphon. They focus on a restricted class of deterministic models, such as the (deterministic) Kuramoto model. In the stochastic setting, Luçon [20] considers particle systems on $\mathbb{R}^d$ with additive noise and a transport term in the dynamics; the initial conditions are assumed to be independent but not identically distributed. The hypothesis on the convergence of the graph sequence is somewhat involved, as it is not expressed in terms of the cut-norm: the labeled graph limit must satisfy extra regularity assumptions, such as continuity or integrability. Oliveira and Reis [24] focus on a system of interacting particles on $\mathbb{R}$ whose dynamics is defined by means of a Hamiltonian, with i.i.d. initial conditions. They establish a large deviation principle for the empirical measure by explicitly using the cut-norm for the convergence of graph sequences; the proofs are based on large deviation techniques, and the labeled graph limit must be Lipschitz.
Bayraktar et al. [4] consider interacting particle systems on $\mathbb{R}^d$ with multiplicative noise. The graph sequences arise from deterministic graphons under regularity assumptions comparable to ours (namely, only measurability for establishing the law of large numbers). They establish a law of large numbers and a propagation of chaos property. Their proof is based on careful trajectorial estimates as well as a polynomial representation of the interaction, suitable for the use of the cut-norm. They do not consider random graph limits.
3.4. Applications to exchangeable graphs
Recall that an exchangeable random graph $\xi = \{\xi_{ij}\}_{i,j \in \mathbb{N}}$ (see [17]) is an infinite array of jointly exchangeable binary random variables; that is, it satisfies

$\mathbb{P}\big( \xi_{ij} = e_{ij},\ 1 \leq i < j \leq n \big) = \mathbb{P}\big( \xi_{\sigma(i)\sigma(j)} = e_{ij},\ 1 \leq i < j \leq n \big) \quad (3.7)$

for all $n \in \mathbb{N}$, all permutations $\sigma$ on $\{1, \ldots, n\}$, and all $e_{ij} \in \{0,1\}$.
Remark 3.2. Any finite deterministic graph $\xi$ leads to an exchangeable random graph by performing a uniform random sampling from its associated graphon $W_\xi$; see (2.6) and [19, §10].
More generally, for $W \in \widetilde{\mathcal{W}}_0$ one may construct an exchangeable random graph $\xi^W$, usually called a W-random graph, defined for i and j in $\mathbb{N}$ by

$\xi^W_{ij} = \xi^W_{ji} \sim \text{Bernoulli}\big( W(U_i, U_j) \big), \quad \text{independently for } i < j, \quad (3.8)$

where $\{U_i\}_{i \in \mathbb{N}}$ is a sequence of i.i.d. uniform random variables on I. As recalled by the next theorem, the converse is also true: every exchangeable random graph can be obtained in this way, provided that W is random.
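The sampling scheme for a W-random graph is straightforward to sketch. The rank-one graphon $W(x,y) = xy$ below is only an illustrative choice; its edge density is $\big(\int_I x \,\mathrm{d}x\big)^2 = 1/4$.

```python
import numpy as np

def w_random_graph(W, n, rng):
    """Sample an n-vertex W-random graph: xi_ij ~ Bernoulli(W(U_i, U_j))."""
    U = rng.uniform(size=n)                  # i.i.d. labels U_i ~ Unif[0, 1]
    P = W(U[:, None], U[None, :])            # matrix of connection probabilities
    xi = (rng.uniform(size=(n, n)) < P).astype(int)
    xi = np.triu(xi, 1)                      # keep i < j, drop the diagonal
    return xi + xi.T                         # symmetrize: simple undirected graph

rng = np.random.default_rng(0)
W = lambda x, y: x * y                       # illustrative graphon W(x, y) = xy
xi = w_random_graph(W, 500, rng)

assert np.array_equal(xi, xi.T) and np.all(np.diag(xi) == 0)
# edge density should be close to the integral of W over the unit square, 1/4
assert abs(xi.sum() / (500 * 499) - 0.25) < 0.05
```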
The characterization of exchangeable random graphs is a consequence of the works of Hoover, Aldous, and Kallenberg; see [17] and references therein. We recall their main result here.
Theorem 3.3. ([17, Theorem 5.3], [19, Theorem 11.52].) Let $\xi = \{\xi_{ij}\}_{i,j \in \mathbb{N}}$ be an exchangeable random graph. Then $\xi$ is a W-random graph for some random $W \in \widetilde{\mathcal{W}}_0$. Moreover, let $\xi^{(n)} := \{\xi_{ij}\}_{i,j= 1, \ldots, n}$ for every $n \in \mathbb{N}$. It holds that

$\delta_\square\big( W_{\xi^{(n)}}, W \big) \longrightarrow 0 \quad \mathbb{P}\text{-almost surely}$

as $n\to\infty$.
We are now ready to state the main corollary of Theorem 3.2, which deals with exchangeable random graphs.
Corollary 3.1. Let $\xi = \{\xi_{ij}\}_{i,j \in \mathbb{N}}$ be an exchangeable random graph and let W be the limit of $\xi^{(n)} := \{\xi_{ij}\}_{i,j= 1, \ldots, n}$ in the sense of Theorem 3.3. Assume that the initial conditions $\{\theta^i_0\}_{i \in \mathbb{N}}$ are i.i.d. and independent of $\{\xi^{(n)}\}_{n \in \mathbb{N}}$. Then

$\mu^n \longrightarrow \bar{\mu} \quad \text{in } \mathbf{P}\times\mathbb{P}\text{-probability as } n \to \infty,$

where $\bar{\mu}$ is the solution to (3.4) starting from $\bar{\mu}_0$ with graphon W.
We say that $\xi = \{\xi^{(n)}\}_{n\in\mathbb{N}}$ is a sequence of exchangeable graphs when the random variables $\{\xi^{(n)}_{ij}\}_{i,j =1, \ldots, n}$ are exchangeable for each $n \in \mathbb{N}$. Observe that $\xi$ is not necessarily an exchangeable random graph as in (3.7). Whenever $\xi = \{\xi^{(n)}\}_{n\in\mathbb{N}}$ is a sequence of exchangeable graphs, the particles $\{\theta^{i,n}\}_{i =1, \ldots, n}$ are exchangeable as well (recall that the initial conditions are assumed to be i.i.d.) and, in particular, their joint distribution is symmetric, i.e. invariant under permutation of the labels. Observe that this is not true when the graph is not exchangeable. A classical result by Sznitman [26, Proposition 2.2] states that, for a symmetric joint distribution of particles, the law of large numbers for the empirical measure is equivalent to the propagation of chaos property. From equation (3.6) we can thus deduce a propagation of chaos statement for the particle system (3.1); this is illustrated in the next proposition.
Proposition 3.1. Assume the hypothesis of Theorem 3.2. If $\xi = \{\xi^{(n)}\}_{n\in\mathbb{N}}$ is a sequence of exchangeable graphs, then for every $k \in \mathbb{N}$,

$\mathcal{L}\big( \theta^{1,n}, \ldots, \theta^{k,n} \big) \longrightarrow \bar{\mu}^{\otimes k} \quad \text{as } n \to \infty,$

where $\mathcal{L}$ stands for the law in $\mathbf{P} \times \mathbb{P}$-probability and $\bar{\mu}$ is the solution to (3.4).
We omit the proof of Proposition 3.1.
3.5. Mean-field behavior and two explanatory examples
Theorem 3.2 allows for a better understanding of the relationship between random graph sequences and the behavior of the empirical measure.

(1) It highlights the difference between the randomness present in the graph $\xi^{(n)}$ for every $n \in \mathbb{N}$ and the randomness left in the limit W.

(2) It presents a new class of random Fokker–Planck equations as possible limit descriptions for the empirical measure $\mu^n$ .
As a byproduct, Theorem 3.2 yields a precise characterization of the graph sequences for which the empirical measure converges to the mean-field limit. Let us recall what we mean by mean-field limit and first discuss this last issue; we then address (1) and (2) with the help of two examples.
Consider system (3.1) on a sequence of complete graphs, i.e. $\xi^{(n)}_{ij} \equiv 1$ for every i, j, and n. It is well known [23, 26] that the empirical measure $\mu^n$ converges to the mean-field limit $\rho \in \mathcal{C}([0,T], \mathcal{P}(\mathbb{T}))$, defined as the unique solution to the following McKean–Vlasov equation:

$\partial_t \rho_t = \frac{1}{2}\, \partial^2_\theta \rho_t - \partial_\theta \Big[ \rho_t \Big( F(\theta) + p \int_{\mathbb{T}} \Gamma(\theta, \psi)\, \rho_t(\mathrm{d}\psi) \Big) \Big], \quad (3.10)$

with initial condition $\bar{\mu}_0$ and $p=1$. Existence and uniqueness for the solution to (3.10) hold under our assumptions on F, $\Gamma$, and $\bar{\mu}_0$; see again [23] and [26].
Suppose that the graph sequence converges to a deterministic limit; we discuss the case of a random limit after the first example. Theorem 3.2 implies that for every sequence $\{\xi^{(n)}\}_{n\in\mathbb{N}}$ which converges to some flat graphon $W\equiv p \in [0,1]$, the empirical measure $\mu^n$ in the limit satisfies equation (3.10) with corresponding p. Since the convergence of $\xi^{(n)}$ to a non-constant graphon gives rise to equation (3.4), which is – at least formally – different from (3.10), we conclude that if the sequence $\xi^{(n)}$ converges to a constant graphon, then the limit of $\mu^n$ is formally mean-field. The graphs with such asymptotic behavior are known in the literature as pseudo-random graphs; see [3], [12], and [19, §11.8.1].
We now address the issues (1) and (2) with two explanatory examples. The mean-field comparison when the graph limit is random is discussed after the next example.
3.5.1. Example I: W-random graphs.
Fix $p \in (0,1)$ and let g be a random variable on (0, 1) with mean $\sqrt{p}$ and distribution function $F_g$. Let $\{g_i\}_{i \in \mathbb{N}}$ be a sequence of i.i.d. copies of g. Conditionally on $\{g_i\}_{i\in \mathbb{N}}$, $\xi^{(n)}_{ij}$ is defined as

$\xi^{(n)}_{ij} = \xi^{(n)}_{ji} \sim \text{Bernoulli}(g_i g_j), \quad \text{independently for } 1 \leq i < j \leq n. \quad (3.11)$

The graph $\xi^{(n)}$ is the dense analogue of the inhomogeneous random graph, also known as the rank-1 model; see e.g. [6] and [8]. In this model, $g_i$ corresponds to the weight associated with particle i and, loosely speaking, the closer $g_i$ is to 1, the more connections particle i forms. We expect that assigning different distributions to g leads to different behaviors for the empirical measure (3.2).
The construction in (3.11) yields a binary array $\{\xi^{(n)}_{ij}\}_{i,j = 1, \ldots, n}$ of exchangeable random variables. In particular, the edges have the same expected value $\mathbb{E}[\xi^{(n)}_{ij}] = p$ for every $i\neq j$. We are interested in comparing the empirical measure of the system (3.1) defined on the graph (3.11) to the empirical measure of the corresponding annealed system, obtained from (3.1) by replacing $\xi^{(n)}_{ij}$ with its expected value. More precisely, the annealed system is defined as the solution to

$\mathrm{d}\theta^{i,n}_t = F(\theta^{i,n}_t)\,\mathrm{d}t + \frac{p}{n} \sum_{j=1}^n \Gamma(\theta^{i,n}_t, \theta^{j,n}_t)\,\mathrm{d}t + \mathrm{d}B^i_t, \quad i \in [n], \quad (3.12)$

for which the asymptotic behavior is known to be the mean-field limit (3.10).
The behavior of system (3.1) on the graph sequence (3.11) is described in the limit by (3.12) only when g is deterministic and equal to $\sqrt{p}$. Recall the definition of a W-random graph given in (3.8): we see that $\xi^{(n)}$ is a $W_g$-random graph with

$W_g(x, y) := F^{-1}_g(x)\, F^{-1}_g(y), \quad x, y \in I,$

where $F^{-1}_g$ is the pseudo-inverse of $F_g$. In particular, the $\mathbb{P}$-a.s. limit of $\xi^{(n)}$ is given by $W_g$, and thus the limit of $\mu^n$ by the solution to equation (3.4) with $W=W_g$. Theorems 3.2 and 4.1 imply that if $W_g$ is arbitrarily close to the constant graphon p in the cut-distance, i.e. if $\text{Var}[g] \ll 1$, then the empirical measure of the system associated to $\xi^{(n)}$ is arbitrarily close to the mean-field limit of the annealed system (3.12). In this case, $\{\xi^{(n)}\}_{n \in \mathbb{N}}$ is close to an Erdős–Rényi graph sequence, for which the mean-field behavior is already known; see [15]. Finally, observe that choosing a suitable deterministic sequence of weights $\{g_i\}_{i \in [n]}$, e.g. $g_i = F^{-1}_g (i/n)$ for $i \in [n]$, would lead to a random graph $\xi^{(n)}$ which is not exchangeable. In particular, $\mathbb{E}[\xi^{(n)}_{ij}]$ is not constant and changes with i and j. Nonetheless, the sequence $\xi^{(n)}$ still converges to the same limit $W_g$, $\mathbb{P}$-a.s. in the realization of the Bernoulli random variables, possibly at the cost of requiring some regularity on $W_g$; see [19, §11.4].
This example illustrates how the randomness related to the exchangeability in the sequence $\xi^{(n)}$ is lost in the limit of $\mu^n$ , as it is lost in the graph limit $W_g$ . In this sense, adding exchangeability to system (3.1) does not yield any averaging property on the empirical measure $\mu^n$ . Moreover, adding the extra randomness through Bernoulli random variables in (3.11) does not alter this fact. In other words, taking $\xi^{(n)}_{ij} = g_i g_j \in [0,1]$ yields yet again the same limit for $\mu^n$ .
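To make the construction concrete, the following sketch samples an exchangeable graph of the type discussed above under the assumption that, given i.i.d. weights $g_i \in [0,1]$ , the edges $\xi^{(n)}_{ij}$ are independent Bernoulli variables with parameter $g_i g_j$ ; the weight law and all numerical values are illustrative choices, not taken from the paper.

```python
import numpy as np

# Illustrative sketch (choices ours): given i.i.d. weights g_i in [0, 1],
# connect i ~ j with probability g_i * g_j, independently over pairs.
rng = np.random.default_rng(0)

def sample_exchangeable_graph(n, rng):
    g = rng.uniform(0.2, 0.8, size=n)        # i.i.d. weights (illustrative law)
    probs = np.outer(g, g)                   # connection probabilities g_i * g_j
    coins = rng.random((n, n)) < probs       # independent Bernoulli coins
    xi = np.triu(coins, k=1)                 # keep i < j only: no self-loops
    return (xi | xi.T).astype(int), g        # symmetrize

n = 2000
xi, g = sample_exchangeable_graph(n, rng)

# Exchangeability gives every edge the same expected value; with these
# weights p = (E[g])^2 = 0.25, and the empirical edge density is close to it.
density = xi.sum() / (n * (n - 1))
print(density)
```

Replacing the Bernoulli sampling with the deterministic weights $\xi^{(n)}_{ij} = g_i g_j$ would, as noted above, lead to the same graph limit.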
Until now, we have focused on deterministic limits for the sequence $\xi^{(n)}$ . We now consider the case where the limit W is random, and we address the relationship between the resulting system and the meanfield limit $\rho$ given in (3.10). One might be led to conjecture that it is possible to recover the meanfield behavior by, for example, averaging the limit dynamics with respect to the randomness in W. In the next example, we formulate this remark in a rigorous way. We show that this is in general not possible, although it may lead to a new class of asymptotic behaviors which are interesting in their own right, as pointed out in (2) above.
3.5.2. Example II: random mean-field behavior.
Consider the growing preferential attachment graph $\xi_{\text{pa}}$ constructed iteratively as follows; see also [Reference Lovász19, Example 11.44]. Begin with a single node and, assuming that at the nth step there are already n nodes, create a new node with label $n+1$ and connect it to each node $i\in\{1,\ldots,n\}$ with probability $(d_n(i)+1)/(n+1)$ where $d_n(i)$ is the degree of node i at step n and each connection is made independently of the others. Denote the corresponding random graph by $\xi^{(n+1)}_{\text{pa}}$ .
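The iterative construction can be transcribed directly; the following is a minimal sketch, with graph size and seed as illustrative choices.

```python
import numpy as np

# Direct transcription of the preferential-attachment construction above:
# at step n, a new node joins and connects to each existing node i with
# probability (d_n(i) + 1) / (n + 1), independently of the other connections.
def grow_pa_graph(n_final, rng):
    adj = np.zeros((n_final, n_final), dtype=int)
    deg = np.zeros(n_final, dtype=int)
    for n in range(1, n_final):              # nodes are 0-indexed here
        probs = (deg[:n] + 1) / (n + 1)
        coins = rng.random(n) < probs
        new_neighbors = np.nonzero(coins)[0]
        adj[new_neighbors, n] = 1
        adj[n, new_neighbors] = 1
        deg[new_neighbors] += 1
        deg[n] = len(new_neighbors)
    return adj

rng = np.random.default_rng(1)
adj = grow_pa_graph(500, rng)
density = adj.sum() / (500 * 499)
print(density)   # one realization; this value fluctuates across runs
```

Re-running with different seeds gives edge densities that settle within a run but differ between runs, in line with Proposition 3.2 below: the limit is a constant graphon whose value is random, fixed by the early steps of the construction.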
Roughly speaking, the behavior of $\xi_{\text{pa}}$ depends crucially on the first steps of the construction and it stabilizes to a homogeneous structure as n grows. This is illustrated in the next proposition.
Proposition 3.2. ([Reference Lovász19, Proposition 11.45].) With probability 1, the sequence $\{\xi^{(n)}_{\textrm{pa}}\}_{n \in \mathbb{N}}$ converges to a random constant graphon.
Consider a particle system defined on the graph sequence $\{\xi^{(n)}_{\text{pa}}\}_{n \in \mathbb{N}}$ . The empirical measure converges to the solution of equation (3.10) with a random p. In other words, $\mu^n$ converges to a random mean-field limit. Integrating (3.10) with respect to this randomness and denoting ${{{\mathbb E}}} [\rho_t]$ by $\bar{\rho}_t$ for every $t \in [0,T]$ , we obtain that $\bar{\rho} \in {{{\mathcal C}}} ([0,T],{{{\mathcal P}}} ({{{\mathbb T}}} ))$ satisfies
for $t \in [0,T]$ . Note that (3.13) is not written in closed form because of the third term on the right-hand side, which is not linear in $\rho$ and p. In this sense, $\bar{\rho}$ does not formally satisfy the mean-field limit, i.e. it is not a solution to (3.10) with some deterministic $p \in [0,1]$ . By definition, $\bar{\rho}$ is a mixture of mean-field limits, weighted by the distribution of p.
To have an intuitive understanding of what $\bar{\rho}$ may look like, consider the stochastic Kuramoto model without natural frequencies [Reference Bertini, Giacomin and Pakdaman5], [Reference Coppini14] defined on the sequence $\xi^{(n)}_{\text{pa}}$ . The model is defined as the solution to
for $i =1, \ldots, n$ and $t \in [0,T]$ . It corresponds to (3.1) with the choices $F\equiv 0$ and $\Gamma(\theta,\psi) = -K\sin(\theta-\psi)$ . An application of Theorem 3.2 and Proposition 3.2 implies that the empirical measure of (3.14) converges to the solution of
where $*$ stands for the convolution operator. It is well known that the system (3.15) undergoes a phase transition as the coupling strength pK crosses the critical threshold $pK=1$ . Hence the phase transition for this specific model occurs at a random critical threshold. Depending on the sampled value of p, one obtains stable synchronous solutions in the supercritical regime ($pK >1$), or uniformly distributed oscillators on ${{{\mathbb T}}}$ ($0\leq pK<1$). The solution to equation (3.15) can be written down explicitly (see again [Reference Bertini, Giacomin and Pakdaman5], [Reference Coppini14]); integrating over the randomness of p gives a superposition of synchronous and asynchronous states which, in general, is not a mean-field solution, i.e. it does not solve (3.15) for any fixed $p \in [0,1]$ .
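The two regimes can be illustrated numerically through the synchronization order parameter $r_t = |n^{-1}\sum_j {\text{e}}^{{\text{i}}\theta^j_t}|$ . The sketch below uses an Euler–Maruyama discretization of the stochastic Kuramoto dynamics on a graph; all numerical choices are ours, and an Erdős–Rényi graph with edge density p stands in for a realization of a dense random graph.

```python
import numpy as np

# Hedged sketch: Euler-Maruyama discretization of a stochastic Kuramoto system,
# dtheta_i = (K/n) sum_j xi_ij sin(theta_j - theta_i) dt + dB_i.
# Step size, horizon, graph, and couplings are illustrative choices.
def simulate_kuramoto(xi, K, T=15.0, dt=0.01, seed=0):
    rng = np.random.default_rng(seed)
    n = xi.shape[0]
    theta = rng.uniform(0.0, 2.0 * np.pi, size=n)    # uniform initial phases
    for _ in range(int(T / dt)):
        drift = (K / n) * (xi * np.sin(theta[None, :] - theta[:, None])).sum(axis=1)
        theta = theta + drift * dt + np.sqrt(dt) * rng.standard_normal(n)
    return theta

def order_parameter(theta):
    # r ~ 0 for incoherent phases, r ~ 1 for synchronized phases
    return float(np.abs(np.exp(1j * theta).mean()))

rng = np.random.default_rng(3)
n, p = 300, 0.5
coins = rng.random((n, n)) < p
xi = np.triu(coins, k=1)
xi = (xi | xi.T).astype(float)                       # ER(n, p) graph

r_sub = order_parameter(simulate_kuramoto(xi, K=1.0))  # pK = 0.5 < 1: incoherent
r_sup = order_parameter(simulate_kuramoto(xi, K=8.0))  # pK = 4 > 1: synchronized
print(r_sub, r_sup)
```

The effective coupling is pK, so the transition point depends on the realized edge density p; for $\xi^{(n)}_{\text{pa}}$ this density, hence the critical threshold, is itself random.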
4. The nonlinear process
We introduce a nonlinear process (4.4) which has already been considered in the literature [Reference Bayraktar, Chakraborty and Wu4, Reference Chiba and Medvedev11, Reference Luçon20, Reference Luçon and Stannat21, Reference Oliveira and Reis24] as the natural candidate if the particles in (3.1) are not exchangeable and their labels are fixed from the initial condition. This process is useful for studying the evolution of a tagged particle with a specific profile of connections, as stressed in [Reference Luçon20].
Contrary to our setting, some regularity of the (now labeled) graphon is usually assumed in order to show convergence of the empirical measure (3.2). We will exploit the nonlinear process with fixed labels (4.4) to better understand (3.3) and to establish existence and uniqueness.
Before introducing (4.4), we define some further tools for dealing with empirical measures and graphons. Notably, we introduce an equivalence relation between probability measures on $I \times {{{\mathbb T}}} $ inspired by graph limit theory; see (4.2). This will allow us to prove Theorem 4.1, where we establish that the empirical measure is Hölder-continuous with respect to the underlying graphon.
4.1. Distances between probability measures
Let ${{{\mathcal M}}} _T$ be the space of probability measures on $I \times {{{\mathcal C}}} ([0,T], {{{\mathbb T}}} )$ with first marginal equal to the Lebesgue measure $\lambda$ on I, that is,
where $p_1$ is the projection map associated to the first coordinate. For $\mu \in {{{\mathcal M}}} _T$ the following decomposition holds:
where $\mu^x \in {{{\mathcal P}}} ({{{\mathcal C}}} ([0,T], {{{\mathbb T}}} ))$ for almost every $x \in I$ . From now on, we denote the Lebesgue measure $\lambda ({\text{d}} x)$ on I simply by ${\text{d}} x$ . For $\mu, \nu \in {{{\mathcal M}}} _T$ , define their distance by
Remark 4.1. Observe that the previous expression makes sense, as
is a $\lambda$ -measurable function. This is because we consider $\mu \in {{{\mathcal P}}} (I \times {{{\mathcal C}}} ([0,T], {{{\mathbb T}}} ))$ . If one considers ${{{\mathcal P}}} ({{{\mathcal C}}} ([0 ,T], {{{\mathbb T}}} ))^I$ instead, as done in [Reference Bayraktar, Chakraborty and Wu4], then an extra assumption is required, namely that every $\eta = \{\eta^x\}_{x \in I} \in {{{\mathcal P}}} ({{{\mathcal C}}} ([0,T], {{{\mathbb T}}} ))^I$ is $\lambda$ -measurable in $x \in I$ .
Remark 4.2. Observe that the previous definitions make sense also with $T=0$ and ${{{\mathcal C}}} ([0,T], {{{\mathbb T}}} )$ replaced by ${{{\mathbb T}}} $ . In particular, ${{{\mathcal M}}} _0$ is the space of probability measures on $I\times {{{\mathbb T}}} $ with first marginal equal to the Lebesgue measure $\lambda$ on I, that is,
and
Inspired by the graphon framework, one can define the following equivalence relation on ${{{\mathcal M}}} _T$ (the case $T=0$ is analogous): for $\mu, \nu \in {{{\mathcal M}}} _T$ ,
Endow the quotient space ${{{\mathcal M}}} _T / \sim$ with the induced distance given by
where we have used the notation $\nu^\varphi = \{\nu^{\varphi(x)}\}_{x \in I}$ . Observe that if $\mu \sim \nu$ , then
In particular, for every $\varphi \in S_I$ ,
By taking the infimum with respect to $\varphi \in S_I$ , we obtain
4.2. The nonlinear process with fixed labels
Fix a labeled graphon $W \in {{{\mathcal W}}} _0$ together with an initial condition $\mu_0 \in {{{\mathcal M}}} _0$ . Consider the process $\theta = \{\theta^x\}_{x \in I}$ that solves the system
where $\{\theta^x_0\}_{x \in I}$ is a random vector such that ${{{\mathcal L}}} (\theta^x_0)= \mu^x_0$ for $x \in I$ , and $\{B^x\}_{x\in I}$ is a family of i.i.d. Brownian motions independent of $\{\theta^x_0\}_{x \in I}$ . The following proposition shows existence and uniqueness for the solution of (4.4). The proof follows a classical argument by Sznitman [Reference Sznitman26] and is postponed to Section 4.3.
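For orientation, we record the form that the fixed-label nonlinear process takes in this literature (see e.g. [Reference Luçon20], [Reference Bayraktar, Chakraborty and Wu4]); the display below is a sketch consistent with the Fokker–Planck equation (4.5), with all symbols as defined above.

```latex
% Sketch of the labeled nonlinear process: theta^x interacts with the whole
% family through the graphon W and the time-marginals mu^y_s of the laws nu^y.
\theta^x_t \;=\; \theta^x_0
  \;+\; \int_0^t F(\theta^x_s)\, \mathrm{d} s
  \;+\; \int_0^t \!\! \int_I W(x,y) \int_{\mathbb{T}}
        \Gamma\bigl(\theta^x_s,\theta\bigr)\, \mu^y_s(\mathrm{d}\theta)\,
        \mathrm{d} y \, \mathrm{d} s
  \;+\; B^x_t , \qquad x \in I ,
```

where $\mu^y_s = {{{\mathcal L}}} (\theta^y_s)$ , which makes the equation nonlinear in the sense of McKean.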
Proposition 4.1. There exists a unique pathwise solution $\theta = \{\theta^x\}_{x \in I}$ to (4.4). If $\nu^x \in {{{\mathcal C}}} ([0,T],{{{\mathcal P}}} ({{{\mathbb T}}} ))$ denotes the law of $\theta^x$ for $x \in I$ , then $I \ni x \mapsto \nu^x \in {{{\mathcal P}}} ( {{{\mathcal C}}} ([0,T], {{{\mathbb T}}} ))$ is Lebesgue-measurable. For every $x \in I$ , $\nu^x$ satisfies the following nonlinear Fokker–Planck equation in the weak sense:
with initial condition $\mu^x_0 \in {{{\mathcal P}}} ({{{\mathbb T}}} )$ .
The process $\{\theta^x\}_{x \in I}$ is indexed by the space of labels I. For two different labels x and y in I, the behavior of particles $\theta^x$ and $\theta^y$ may vary depending on their connection profile encoded in W and the two marginals $\mu^x$ and $\mu^y$ may vary as well. Similar results in different settings have already been shown in [Reference Bayraktar, Chakraborty and Wu4], [Reference Caines and Huang9], [Reference Luçon20], [Reference Luçon and Stannat21], and [Reference Oliveira and Reis24].
It is worth noting that the law $\mu=\{\mu^x\}_{x \in I} \in {{{\mathcal M}}} _T$ is continuous with respect to the cut-norm (or, equivalently, the $d_\square$ -distance) in ${{{\mathcal W}}} _0$ , as already remarked in [Reference Bayraktar, Chakraborty and Wu4, Theorem 2.1] for systems much more general than those considered here. Exploiting the compactness of ${{{\mathbb T}}} $ and some extra regularity of $\Gamma$ , we are able to prove that the map $W\mapsto \mu^W$ is Hölder-continuous, as shown in the next proposition.
Proposition 4.2. Assume the hypothesis of Theorem 3.2. There exists a positive constant C, depending only on $\Gamma$ and T, such that if $\mu^W$ and $\mu^V$ denote the laws of the solutions to (4.4) with $W \in {{{\mathcal W}}} _0$ and $V \in {{{\mathcal W}}} _0$ respectively, then
The proof is based on classical trajectorial estimates and Fourier analysis. As $\Gamma$ is a function on the torus, it is possible to apply Fouriertype arguments to factorize it into its two components; this is the key point to make the graphon norm $\lVert\cdot\rVert$ appear and prove Proposition 4.2. The full proof is postponed to Section 4.3.
The above proposition can be translated in terms of the cut-distance $\delta_\square$ (recall (2.8)) and the space of graphons $\widetilde{{{{\mathcal W}}} }_0$ .
Theorem 4.1. Assume the hypothesis of Theorem 3.2. Then there exists a positive constant C, depending only on $\Gamma$ and T, such that if $\bar{\mu}^W$ and $\bar{\mu}^V$ denote the laws of the solutions to equation (3.3) associated with graphons W and V respectively, it holds that
The proof of Theorem 4.1 is a straightforward consequence of Proposition 4.2; it is given in Section 4.4.
The 2-Wasserstein distance $D_T$ (recall the definition (2.1)) could be replaced by the p-Wasserstein distance in (2.1) for $p \geq 1$ . This would lead to a Hölder exponent in (4.6) as large as $1/p$ . We stick to $p=2$ , but any other choice is possible, modulo a different constant C.
Observe that Theorems 3.1 and 4.1 imply that the following mapping is continuous:
where $\bar{\mu}^W$ is the law of $\theta$ solving equation (3.3) with graphon W. In particular, to every random variable W in $\widetilde{{{{\mathcal W}}} }_0$ corresponds a random variable $\bar{\mu}^W$ with values in ${{{\mathcal C}}} ([0,T], {{{\mathcal P}}} ({{{\mathbb T}}} ))$ , i.e. for almost every $\omega \in \Omega$ , $\bar{\mu}^W (\omega) = \bar{\mu}^{W(\omega)}$ .
Remark 4.3. We point out that Theorem 4.1 allows us to conclude that two solutions to equation (3.4) are close as probability measures if the corresponding graphons are close in $\widetilde{{{{\mathcal W}}} }_0$ . However, whether two different graphons can lead to similar behaviors in the particle system is still not clear. In other words, we are not able to provide any lower bound complementary to the upper bound given in equation (4.6). To our knowledge, this aspect may be model-dependent and needs further investigation.
4.2.1. Relation between labeled and unlabeled nonlinear processes.
Consider a probability distribution $\mu_0 \in {{{\mathcal M}}} _0$ such that $\int_I \mu^x_0 \, {\text{d}} x = \bar{\mu}_0$ . The solution to (3.4) is given by $\bar{\mu} = \int_I \mu^x \, {\text{d}} x$ , where $\mu^x$ is the law of $\theta^x$ solving (4.4) with initial condition $\mu^x_0$ and labeled graphon W. In other words, $\theta$ has the same law as $\theta^U$ , the solution to (4.4), where U is a uniform random variable in I independent of the other randomness in the system. As the following remark shows, the law $\bar{\mu}$ of $\theta$ depends neither on the representative W nor on $\mu_0$ .
Remark 4.4. Let $\varphi \in S_I$ , i.e. $\varphi$ is an invertible measurepreserving map from I to itself, and let $\nu = \{ \nu^x\}_{x \in I}$ be the law of $\{\theta^{\varphi(x)}\}_{x \in I}$ solving (4.4). By a change of variables, $\theta^{\varphi(x)}$ solves
and can be rewritten with $V= W^\varphi$ and $\psi^x = \theta^{\varphi(x)}$ as
which has the same law as (4.4) with labeled graphon V and initial conditions $\{\theta^{\varphi(x)}\}_{x\in I}$ .
Observe that the laws $\nu$ and $\mu$ associated to (4.7) and (4.4), respectively, differ only in the labeling of the vertices, but their distance in ${{{\mathcal M}}} _T$ is not zero, due to the initial conditions and the fact that $\lVert W-V\rVert_\square = \lVert W-W^\varphi\rVert_\square$ is, in general, different from zero. However, if one looks at $\bar{\mu} = \int_I \mu^x \, {\text{d}} x$ and $\bar{\nu} = \int_I \nu^x \, {\text{d}} x$ , they coincide as probability measures in the sense that $D_T(\bar{\mu},\bar{\nu})=0$ . In particular, the law of the solution to equation (3.3) is also equal to the law of $\psi^U$ , where $\psi^x$ solves (4.7) and U is uniformly distributed on I.
4.3. Proofs for the nonlinear process (4.4) with fixed labels
Proof of Proposition 4.1. The proof follows a classical argument given in [Reference Sznitman26, Lemma 1.3]. Consider $\nu \in {{{\mathcal M}}} _T$ and $\{\theta^{x,\nu}\}_{x \in I}$ solving
where the initial conditions and the Brownian motions are the same as in (4.4). Observe that, by Remark 4.1, the mapping $I \ni x \mapsto \nu^x_s \in {{{\mathcal P}}} ({{{\mathbb T}}} )$ is Lebesgue-measurable for every $s \in [0,T]$ . Since F and $\Gamma$ are bounded Lipschitz functions, there exists a unique solution to (4.8), which we denote by $\Phi(\nu) \in {{{\mathcal M}}} _T$ . The solution of (4.8) is constructed as the limit of a Cauchy sequence of elements of ${{{\mathcal M}}} _T$ . Since these are measurable as functions of $x\in I$ , the mapping $I \ni x \mapsto \text{Law}(\theta^{x,\nu})\in{{{\mathcal P}}} ({{{\mathcal C}}} ([0,T], {{{\mathbb T}}} ))$ is also Lebesgue-measurable. Thus the map
is well-defined. A solution to (4.4) is a fixed point of $\Phi$ , and any fixed point of $\Phi$ is a solution to (4.4).
For $\mu, \nu \in {{{\mathcal M}}} _T$ , consider the processes $\theta^{x,\mu}$ and $\theta^{x,\nu}$ , with $x \in I$ . We estimate their distance as
Adding and subtracting the quantity $\Gamma (\theta^{x,\mu}_s, \theta) \nu^y_s({\text{d}} \theta)$ in the second integral, and using that F and $\Gamma$ are Lipschitz-continuous and that F, $ \Gamma$ , and W are bounded, we get
From (2.3) we obtain
from which, using (4.1), we deduce
The definition of $D_T$ (2.2) and an application of Gronwall’s lemma lead to
From the last relation we obtain the uniqueness of solutions to (4.4).
We prove that a solution exists by iterating (4.9). Indeed, for $k \geq 1$ and $\mu \in {{{\mathcal M}}} _T$ , we get
In particular, $\{\Phi^k (\mu)\}_{k \in \mathbb{N}}$ is a Cauchy sequence for k large enough, and its limit is the fixed point of $\Phi$ . Note that $d_t(\Phi(\mu), \mu) < \infty$ since we are working on the compact space ${{{\mathbb T}}} $ .
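The contraction mechanism behind the iteration is the classical one: a single Gronwall step bounds distances at time t by time-integrated distances, and iterating produces a factorial gain. Schematically, with C the constant from the estimates above,

```latex
% One application of the estimate above gives
d_t\bigl(\Phi(\mu),\Phi(\nu)\bigr)^2
  \;\le\; C \int_0^t d_s(\mu,\nu)^2 \, \mathrm{d} s ,
% and iterating k times yields
d_t\bigl(\Phi^k(\mu),\Phi^k(\nu)\bigr)^2
  \;\le\; \frac{(C t)^k}{k!} \, d_t(\mu,\nu)^2 ,
```

so $\Phi^k$ is a strict contraction once $(CT)^k/k! < 1$ , which yields both the Cauchy property of $\{\Phi^k(\mu)\}_{k \in \mathbb{N}}$ and the uniqueness of the fixed point.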
For the second part of Proposition 4.1, apply Itô’s formula to $f(\theta^x_t)$ with $f \in {{{\mathcal C}}} ^\infty_0$ to get
Integrating with respect to ${{{\mathbf P}}} $ yields the weak formulation of (4.5).
Next we move to the proof of Proposition 4.2.
Proof of Proposition 4.2. Let $\{\theta^{x,W}\}_{x \in I}$ and $\{\theta^{x,V}\}_{x \in I}$ be the two nonlinear processes associated to W and V respectively. We compare the two solutions: as in the proof of Proposition 4.1, by adding and subtracting the term
in the integrals, we get
Using that F and $\Gamma$ are Lipschitz-continuous and that F, $ \Gamma$ , and W are bounded, we get
After taking the supremum over $s\in[0,t]$ , the expectation ${{{\mathbf E}}} $ and integrating with respect to $x \in I$ , we are able to apply Gronwall’s lemma as in (4.9) to get
where G is given by
Applying Gronwall’s inequality to (4.10) yields
The proof is concluded provided that $G \leq C' \lVert W-V\rVert_\square$ for some constant $C'>0$ .
We take advantage of the Fourier series representation of $\Gamma$ . As we are working on the torus, we can factorize $\Gamma$ into its two components, that is,
where
The condition $\Gamma \in {{{\mathcal C}}} ^{1+\varepsilon}$ gives a decay estimate on the Fourier coefficients $\Gamma_{kl}$ . Indeed, classical results on the asymptotics of Fourier series [Reference Katznelson18, pp. 24–26] imply that
Plugging this expression into (4.11), we obtain that
Multiplying and dividing by $(kl)^{(1+\varepsilon)/2}$ , we are left with
where in the second step we applied the Cauchy–Schwarz inequality and (4.13). Using that W and V are bounded, as well as the fact that
we conclude
Since the norm $\lVert\,\cdot\,\rVert_{\infty\to 1}$ is equivalent to the cut-norm (2.5), the proof is concluded.
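The way the cut-norm enters can be summarized as follows; this is a sketch, assuming the Fourier factorization of $\Gamma$ used in the proof.

```latex
% Writing Gamma(theta, psi) = sum_{k,l} Gamma_{kl} e^{i k theta} e^{i l psi},
% the difference of the interaction terms factorizes as
\int_I (W-V)(x,y) \int_{\mathbb{T}} \Gamma(\theta^x_s, \theta)\,
    \mu^y_s(\mathrm{d}\theta)\, \mathrm{d} y
  \;=\; \sum_{k,l} \Gamma_{kl}\, \mathrm{e}^{\mathrm{i} k \theta^x_s}
        \int_I (W-V)(x,y)\, \widehat{\mu^y_s}(l)\, \mathrm{d} y ,
% where each summand tests W - V against functions bounded by 1, and is
% therefore controlled by ||W - V||_{infty -> 1}, i.e. by the cut-norm;
% the decay |Gamma_{kl}| <~ (|k||l|)^{-(1+eps)} makes the sum over k, l finite.
```

Here $\widehat{\mu^y_s}(l) = \int_{{{\mathbb T}}} {\text{e}}^{{\text{i}} l \theta} \mu^y_s({\text{d}} \theta)$ denotes the l-th Fourier coefficient of the time-marginal law.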
4.4. Proofs for the nonlinear process (3.3)
Proof of Theorem 3.1. The first part follows directly from Proposition 4.1 and Remark 4.4. The proof of (3.4) is similar to the proof of (4.5), but note that we are now integrating with respect to the randomness in U as well.
Proof of Theorem 4.1. Let $\theta^{U,W}$ and $\theta^{U,V}$ be the two solutions to (3.3) associated to W and V respectively, coupled by taking the same uniform random variable U. Let $\mu^{x,W}$ and $\mu^{x,V}$ represent the laws of $\theta^{U,W}$ and $\theta^{U,V}$ conditioned on $U=x$ , for $x \in I$ .
Consider $\varphi \in S_I$ an invertible measurepreserving map. Recall that $\theta^{\varphi(U),V}$ also satisfies equation (3.3) with $V^\varphi$ ; see Remark 4.4. We compare the trajectories $\theta^{U,W}$ and $\theta^{\varphi(U), V}$ .
Consider the difference between the equations satisfied by $\theta^{U,W}$ and $\theta^{\varphi(U), V}$ , add and subtract the term
to obtain that
The first two integrals on the right-hand side are bounded by $C \int_0^s \bigl|\theta^{U,W}_r - \theta^{\varphi(U),V}_r\bigr|^2 \,{\text{d}} r$ , using that F and $\Gamma$ are Lipschitz-continuous, while the third integral on the right-hand side can be estimated using (2.3) and the fact that $0 \leq W \leq 1$ . Thus we get
where we have used the notation $\bigl(\mu^V\bigr)^\varphi$ for $\{\mu^{\varphi(y),V}\}_{y \in I}$ .
Taking the supremum over $s\in[0,t]$ and the expectation with respect to the Brownian motions, the initial conditions, and the random variable U, we obtain
where G is given by
In the proof of Proposition 4.2 we proved the following estimates:
Applying these bounds to (4.14) and using Gronwall’s inequality twice, as in the previous proof, yields
By taking the infimum with respect to $\varphi \in S_I$ and recalling the definition of the cut-distance (2.8) together with (4.3), we obtain
The proof is concluded.
5. Proof of Theorem 3.2
In order to prove Theorem 3.2, we couple the system (3.1) to a sequence of identically distributed copies of the nonlinear process $\theta$ , obtained by sampling i.i.d. uniform random variables $\{U_i\}_{i \in \mathbb{N}}$ and choosing the same initial conditions and Brownian motions as in (3.1).
For every $i \in \mathbb{N}$ , denote these copies by $\theta^i = \theta (U_i)$ . In particular, $\theta^i$ is defined as the solution for $t \in [0,T]$ to
Observe that $\{\theta^i \}_{i \in \mathbb{N}}$ is an exchangeable sequence and, in particular, that the variables $\theta^i$ are independent random variables when conditioned on the randomness of W.
Before the proof of Theorem 3.2, we give a trajectorial estimate.
Lemma 5.1. Under the hypothesis of Theorem 3.2, it holds that
Proof. As before, we compare the trajectories $\theta^{i,n}$ and $\theta^i$ by studying the equation satisfied by $\bigl|\theta^{i,n}_s - \theta^i_s \bigr|^2$ ; recall (3.1) and (5.1). Add and subtract the term
in the integrals, to get
We now use the Lipschitz property of $\Gamma$ and F, sum over i and take the supremum over $s\in[0,t]$ , together with the expectation ${{{\mathbb E}}} \times {{{\mathbf E}}} $ , for which we just write E for simplicity:
Observe that the last term is bounded by a constant divided by n since, by taking the conditional expectation with respect to $\theta^j$ and $U_j$ , we obtain for $i\neq j$
and, conditionally on W, the random variables $\{\theta^i\}_{i \in \mathbb{N}}$ are i.i.d.
Turning to the second term, we will prove that
where $W^{(n)} := \{W(U_i,U_j)\}_{i,j = 1, \ldots, n}$ is a W-random graph with n vertices; see (3.8). This, together with a Gronwall argument, implies that
and the claim follows by letting n tend to infinity and using the fact that $W^{(n)}$ converges ${{{\mathbb P}}}$ -a.s. to W; recall Theorem 3.3.
Turning to (5.2), we use an argument similar to (4.11)–(4.13). Recall that since $\Gamma \in {{{\mathcal C}}} ^{1+\varepsilon}$ , it admits a Fourier series (4.12) with coefficients $\Gamma_{kl}$ such that
Plugging its Fourier expression in the lefthand side of (5.2), multiplying and dividing by $(kl)^{(1+\varepsilon)/2}$ , we get
where we used the Cauchy–Schwarz inequality as in the proof of Theorem 4.1. Observe that $\sum_{k,l} (kl)^{-1-\varepsilon}$ is convergent and that $\bigl| {\text{e}}^{{\text{i}} k \theta^i_s } \bigr| \leq 1$ for all k and s: we can thus bound the previous term ${{{\mathbf P}}}$ -a.s. by
Recall that $W^{(n)} = \{W(U_i,U_j)\}_{i,j = 1, \ldots, n}$ is a W-random graph with n vertices. Since the particles $\{\theta^i\}_{i \in \mathbb{N}}$ are exchangeable, every computation done so far holds for any ordering of $\{\theta^i\}_{i=1, \ldots, n}$ and, in particular, of $\{U_i\}_{i =1, \ldots, n}$ . In particular, the last inequality holds for every relabeling of $W^{(n)}$ .
From the definition of $\hat{\delta}_\square$ (2.7), one can thus take the labeling of $\{U_i\}_{i =1, \ldots, n}$ for every $n \in \mathbb{N}$ , such that
Using the asymptotic equivalence of $\hat{\delta}_\square$ with $\delta_\square$ (see Remark 2.1), the claim is proved and the proof is concluded.
Proof of Theorem 3.2. The equivalence between the convergence in ${{{\mathbb P}}}$ -probability of $\xi^{(n)}$ and equation (3.9) is proved in Lemma A.1. We turn to the proof of the convergence of $\mu^n$ .
It is well known that the bounded Lipschitz distance (recall (2.3)) metrizes the weak convergence of probability measures. In particular, in order to show that $\mu^n$ converges in ${{{\mathbf P}}} \times {{{\mathbb P}}}$ -probability to $\bar{\mu}$ in ${{{\mathcal P}}} ({{{\mathcal C}}} ([0,T], {{{\mathbb T}}} ))$ , it is enough to prove that
for every bounded Lipschitz function f on ${{{\mathcal C}}} ([0,T], {{{\mathbb T}}} )$ with values in ${{{\mathbb R}}}$ .
Using the fact that $\bar{\mu}$ is the law of $\{\theta^i\}_{i \in \mathbb{N}}$ (recall (5.1)), it is enough to show that
This is implied by the fact that f is Lipschitz and by Jensen’s inequality. Indeed,
which goes to zero as $n\to\infty$ by Lemma 5.1.
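Schematically, the last two steps combine the coupling bound with a conditional law of large numbers: for a bounded Lipschitz f with Lipschitz constant L (with respect to the supremum norm on paths),

```latex
% Triangle inequality between the empirical measures of the coupled systems:
\mathbf{E}\bigl|\langle \mu^n, f\rangle - \langle \bar\mu, f\rangle\bigr|
  \;\le\; \frac{L}{n}\sum_{i=1}^n
      \mathbf{E}\Bigl[\, \sup_{t\le T}\bigl|\theta^{i,n}_t-\theta^i_t\bigr| \Bigr]
  \;+\; \mathbf{E}\Bigl|\frac{1}{n}\sum_{i=1}^n f(\theta^i)
      - \langle \bar\mu, f\rangle\Bigr| ,
% where the first term vanishes by Lemma 5.1 (and Jensen's inequality), and
% the second by the law of large numbers for the theta^i, which are i.i.d.
% conditionally on W.
```

This is a sketch of the estimate carried out above, not an additional argument; both terms tend to zero as $n\to\infty$ .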
Appendix A. Graph convergence and random graphons
A.1. Convergence in probability
The characterization of the convergence in distribution for a sequence of graphs was originally given in [Reference Diaconis and Janson17]. Here we give a useful notion of convergence in $\widetilde{{{{\mathcal W}}} }_0$ by means of the cut-distance $\delta_\square$ , which is equivalent to the convergence in probability for graph sequences.
Lemma A.1. Assume that $\{\xi^{(n)}\}_{n \in \mathbb{N}}$ is a sequence of random graphs and that W is a random graphon in $\widetilde{{{{\mathcal W}}} }_0$ . Then $\xi^{(n)}$ converges in ${{{\mathbb P}}}$ -probability to W if and only if (3.5) holds, i.e. if and only if
Proof. Recall that $(\widetilde{{{{\mathcal W}}} }_0, \delta_\square)$ is a compact metric space, so that the convergence of $\xi^{(n)}$ in probability is equivalent to
Observe that the sequence of positive real random variables $\{ \delta_{\square} (\xi^{(n)}, W)\}_{n \in \mathbb{N}}$ is uniformly bounded by 1. Equation (A.1) is then equivalent to the convergence in $L^1$ , i.e. equivalent to (3.5).
Acknowledgements
FC is thankful to his supervisor Giambattista Giacomin for insightful discussions and advice. FC acknowledges the support from the European Union’s Horizon 2020 research and innovation programme under the Marie SkłodowskaCurie grant agreement 665850. FRN was partially supported by the Netherlands Organisation for Scientific Research (NWO) (Gravitation Grant 024.002.003–NETWORKS).
Funding information
There are no funding bodies to thank relating to the creation of this article.
Competing interests
There were no competing interests to declare that arose during the preparation or publication process of this article.