1. Introduction
The ubiquitous availability of large, structured network data in various scientific areas ranging from biology to social sciences has been a driving force in the development of statistical network models [Reference Kolaczyk23, Reference Newman30]. Vertex-exchangeable random graphs, also known as W-random graphs or graphon models [Reference Aldous1, Reference Diaconis and Janson15, Reference Hoover19, Reference Lovász and Szegedy28], offer in particular a flexible and tractable class of random graph models. It includes many models, such as the stochastic block-model [Reference Nowicki and Snijders31], as special cases. Various parametric and nonparametric model-based approaches [Reference Latouche and Robin25, Reference Lloyd, Orbanz, Ghahramani and Roy26, Reference Palla, Lovász and Vicsek33] and nonparametric estimation procedures [Reference Chatterjee14, Reference Gao, Lu and Zhou16, Reference Wolfe and Olhede41] have been developed within this framework. Although the framework is very flexible, it is known that vertex-exchangeable random graphs are dense [Reference Lovász and Szegedy28, Reference Orbanz and Roy32]; that is, the number of edges scales quadratically with the number of nodes. This property is considered unrealistic for many real-world networks.
To achieve sparsity, rescaled graphon models have been proposed in the literature [Reference Bickel and Chen4, Reference Bickel, Chen and Levina5, Reference Bollobás and Riordan8, Reference Wolfe and Olhede41]. While these models can capture sparsity, they are not projective; additionally, standard rescaled graphon models cannot simultaneously capture sparsity and a clustering coefficient bounded away from zero (see Section 5).
These limitations have been overcome by another line of work initiated by [Reference Borgs, Chayes, Cohn and Holden9, Reference Caron and Fox12, Reference Veitch and Roy38]. They showed that, by modelling the graph as an exchangeable point process, one can naturally extend the classical vertex-exchangeable/graphon framework to the sparse regime, while preserving its flexibility and tractability. In such a representation, introduced by [Reference Caron and Fox12], nodes are embedded at some location $\theta_i\in\mathbb R_+$, and the set of edges is represented by a point process on the plane,
$$\sum_{i,j} Z_{ij}\,\delta_{(\theta_i,\theta_j)}, \tag{1}$$
where $Z_{ij}=Z_{ji}$ is a binary variable indicating whether there is an edge between node $\theta_i$ and node $\theta_j$ . Finite-size graphs are obtained by restricting the point process (1) to points $(\theta_i,\theta_j)$ such that $\theta_i,\theta_j\leq\alpha$ , with $\alpha$ a positive parameter controlling the size of the graph. Focusing on a particular construction as a case study, [Reference Caron and Fox12] showed that one can obtain sparse and exchangeable graphs within this framework; it also pointed out that exchangeable random measures admit a representation theorem due to [Reference Kallenberg22], giving a general construction for such graph models. The papers [Reference Herlau, Schmidt and Mørup18, Reference Todeschini, Miscouridou and Caron37] developed sparse graph models with (overlapping) community structure within this framework. The papers [Reference Borgs, Chayes, Cohn and Holden9, Reference Veitch and Roy38] showed how such a construction naturally generalises the dense exchangeable graphon framework to the sparse regime, and analysed some of the properties of the associated class of random graphs, called graphex processes. (The paper [Reference Veitch and Roy38] introduced the term graphex and referred to the class of random graphs as Kallenberg exchangeable graphs, but the term graphex processes is now more commonly used.) Further properties were derived by [Reference Borgs, Chayes, Cohn and Veitch10, Reference Janson20, Reference Janson21, Reference Veitch and Roy39]. Following the notation of [Reference Veitch and Roy38], and ignoring additional terms corresponding to stars and isolated edges, the graph is then parametrised by a symmetric measurable function $W\,:\,\mathbb R^2_+\rightarrow[0,1]$ , where for each $i\leq j$ ,
$$Z_{ij}\mid (\theta_k,\vartheta_k)_{k=1,2,\ldots} \sim \operatorname{Bernoulli}\{W(\vartheta_i,\vartheta_j)\}, \tag{2}$$
where $(\theta_k,\vartheta_k)_{k=1,2,\ldots}$ is a unit-rate Poisson process on $\mathbb R^2_+$ . See Figure 1 for an illustration of the model construction. The function W is a natural generalisation of the graphon for dense exchangeable graphs [Reference Borgs, Chayes, Cohn and Holden9, Reference Veitch and Roy38], and we refer to it as the graphon function.
This paper investigates asymptotic properties of the general class of graphs based on exchangeable point processes defined by Equations (1) and (2). Our findings can be summarised as follows.
(i) We relate the sparsity and power-law properties of the graph to the tail behaviour of the marginal of the graphon function W, identifying four regimes: (a) a dense regime, (b) a sparse (almost dense) regime without power-law behaviour, (c) a sparse regime with power-law behaviour, and (d) an almost extremely sparse regime. In the sparse, power-law regime, the power-law exponent is in the range (1, 2).
(ii) We derive the asymptotic properties of the global and local clustering coefficients, two standard measures of the transitivity of the graph.
(iii) We give a central limit theorem for subgraph counts and for the number of nodes in the graph.
(iv) We introduce a parametrisation that allows us to model separately the global sparsity structure and other local properties such as community structure. Such a framework enables us to sparsify any dense graphon model, and to characterise its sparsity properties.
(v) We show that the results apply to a wide range of sparse and dense graphex processes, including the models studied by [Reference Caron and Fox12, Reference Herlau, Schmidt and Mørup18, Reference Todeschini, Miscouridou and Caron37].
Some of the asymptotic results are illustrated in Figure 2 for a specific graphex process in the sparse, power-law regime.
The article is organised as follows. In Section 2 we give the notation and the main assumptions. In Section 3, we derive the asymptotic results for the number of nodes, degree distribution, and clustering coefficients. In Section 4, we derive central limit theorems for subgraphs and for the number of nodes. Section 5 discusses related work. In Section 6 we provide specific examples of sparse and dense graphs and show how to apply the results of the previous section to those models. In Section 7 we describe a generic construction for graphs with local/global structure and adapt some results of Section 3 to this setting. Most of the proofs are given in the main text, with some longer proofs in the appendix, together with some technical lemmas and background material. Other, more technical proofs are given in supplementary material [Reference Caron, Panero and Rousseau13].
Throughout the document, we use the expressions $X_\alpha \sim Y_\alpha$ and $X_\alpha=o(Y_\alpha)$ respectively for $X_\alpha/Y_\alpha\rightarrow 1$ and $X_\alpha/Y_\alpha\rightarrow 0$. Both $X_\alpha\lesssim Y_\alpha$ and $X_\alpha=O(Y_\alpha)$ are used to express $\limsup X_\alpha/Y_\alpha<\infty$. The notation $X_\alpha\asymp Y_\alpha$ means that both $X_\alpha\lesssim Y_\alpha$ and $Y_\alpha\lesssim X_\alpha$ hold. All unspecified limits are taken as $\alpha$ tends to infinity. When $X_\alpha$ and/or $Y_\alpha$ are random quantities, the asymptotic relation is meant to hold almost surely.
2. Notation and assumptions
2.1. Notation
Let $M=\sum_{i}\delta_{(\theta_{i},\vartheta_{i})}$ be a unit-rate Poisson random measure on $(0,+\infty)^{2}$ , and let $W\,:\,[0,+\infty)^{2}\rightarrow\lbrack0,1]$ be a symmetric measurable function such that $\lim_{x\to\infty} W(x,x)$ and $\lim_{x\to 0} W(x,x)$ both exist (by (3), this implies that $\lim_{x\to\infty} W(x,x)=0$ ) and
$$\overline W=\int_{\mathbb R_+^2} W(x,y)\,dx\,dy<\infty,\qquad \int_0^\infty W(x,x)\,dx<\infty. \tag{3}$$
Let $(U_{ij})_{i,j\in\mathbb{N}^{2}}$ be a symmetric array of independent random variables, with $U_{ij}\sim U(0,1)$ if $i\leq j$ and $U_{ij}=U_{ji}$ for $i>j$ . Let $Z_{ij}=\mathbb{1}_{U_{ij}\leq W(\vartheta_{i},\vartheta_{j})}$ be a binary random variable indicating whether there is a link between i and j, where $\mathbb{1}_{A}$ denotes the indicator function.
Restrictions of the point process $\sum_{ij} Z_{ij}\delta_{(\theta_i,\theta_j)}$ to squares $[0,\alpha]^2$ then define a growing family of random graphs $(\mathcal G_\alpha)_{\alpha\geq 0}$ , called a graphex process, where $\mathcal G_\alpha=(\mathcal V_\alpha,\mathcal E_\alpha)$ denotes a graph of size $\alpha\geq 0$ with vertex set $\mathcal V_\alpha$ and edge set $\mathcal E_\alpha$ , defined by
The connection between the point process and graphex process is illustrated in Figure 3. The conditions (3) are sufficient (though not necessary) conditions for $|\mathcal E_\alpha|$ (hence $|\mathcal V_\alpha|$ ) to be almost surely finite, and for the graphex process to be well defined [Reference Veitch and Roy38, Theorem 4.9]. Note crucially that the graphs $\mathcal G_\alpha$ have no isolated vertices (that is, no vertices of degree 0), and that the number of nodes $|\mathcal V_\alpha|$ and the number of edges $|\mathcal E_\alpha|$ are both random variables.
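The construction above can be illustrated with a minimal simulation sketch. It assumes, purely for illustration, that $W$ vanishes outside $[0,T]^2$ for a known truncation level $T$ (the argument `trunc`), so that only Poisson points with $\vartheta_i\leq T$ need to be generated; the function names and the block graphon are hypothetical and not taken from the paper.

```python
import random


def sample_graphex(W, alpha, trunc, rng):
    """Sample a graph of size alpha from a graphex process with graphon W.

    Assumption (for illustration only): W vanishes outside [0, trunc]^2, so
    only Poisson points with vartheta <= trunc can ever receive an edge.
    """
    # Unit-rate Poisson process on [0, alpha] x [0, trunc]: the number of
    # points is Poisson(alpha * trunc), drawn by counting exponential arrivals.
    area = alpha * trunc
    n, t = 0, rng.expovariate(1.0)
    while t <= area:
        n += 1
        t += rng.expovariate(1.0)
    theta = [rng.uniform(0.0, alpha) for _ in range(n)]
    vartheta = [rng.uniform(0.0, trunc) for _ in range(n)]

    # Z_ij = 1{U_ij < W(vartheta_i, vartheta_j)} for i <= j (loops allowed).
    # A strict inequality is used so that W = 0 never produces an edge.
    edges = set()
    for i in range(n):
        for j in range(i, n):
            if rng.random() < W(vartheta[i], vartheta[j]):
                edges.add((i, j))

    # Only points with at least one edge become vertices: no isolated nodes.
    vertices = {i for edge in edges for i in edge}
    return theta, vartheta, vertices, edges


def block_graphon(x, y):
    """Hypothetical dense example: W = 0.5 on [0, 1]^2 and 0 elsewhere."""
    return 0.5 if (x <= 1.0 and y <= 1.0) else 0.0


theta, vartheta, vertices, edges = sample_graphex(
    block_graphon, alpha=30.0, trunc=1.0, rng=random.Random(0)
)
print(len(theta), len(vertices), len(edges))
```

For graphons with unbounded support, truncating at a finite level is only an approximation, since it discards the points with large $\vartheta_i$ that typically contribute the low-degree nodes.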
We now define a number of summary statistics of the graph $\mathcal G_\alpha$ . For $i\geq 1$ , let
If $\theta_i\in \mathcal V_\alpha$ , then $D_{\alpha,i}\geq 1$ corresponds to the degree of the node $\theta_i$ in the graph $\mathcal G_\alpha$ of size $\alpha$ ; otherwise $D_{\alpha,i}=0$ . Let $N_{\alpha}=|\mathcal V_\alpha|$ and $N_{\alpha,j}$ be the number of nodes and the number of nodes of degree $j,\,j\ge 1$ , respectively,
and $N^{(e)}_{\alpha}=|\mathcal E_\alpha|$ the number of edges
For $i\geq 1$ , let
If $\theta_i\in \mathcal V_\alpha$ , $T_{\alpha, i}$ corresponds to the number of triangles containing node $\theta_i$ in the graph $\mathcal G_\alpha$ ; otherwise $T_{\alpha, i}=0$ . Let
denote the total number of triangles and
the total number of adjacent edges in the graph $\mathcal G_\alpha$ . The global clustering coefficient, also known as the transitivity coefficient, is defined as
if $A_{\alpha}\geq 1$ and 0 otherwise. The global clustering coefficient counts the proportion of closed connected triplets over all the connected triplets, or equivalently the fraction of pairs of nodes connected to the same node that are themselves connected, and is a standard measure of the transitivity of a network [Reference Newman30, Section 7.9]. Another measure of the transitivity of the graph is the local clustering coefficient. For any degree $j\geq 2$ , define
if $N_{\alpha,j}\geq 1$ and 0 otherwise. Then $C_{\alpha,j}^{(\ell)}$ corresponds to the proportion of pairs of neighbours of nodes of degree j that are connected. The average local clustering coefficient is obtained by
if $N_{\alpha}-N_{\alpha,1}\geq 1$ and $\overline{C}_{\alpha}^{(\ell)}=0$ otherwise.
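The summary statistics of this section can be computed from an edge list as in the following sketch. It assumes the standard conventions: loops are ignored, the number of adjacent edges is $\sum_i D_i(D_i-1)/2$, the global coefficient is three times the number of triangles divided by the number of adjacent edges, and the local coefficient for degree $j$ averages $2T_i/\{j(j-1)\}$ over nodes of degree $j$. These conventions match the verbal descriptions above, but the function names and the treatment of loops are assumptions of the sketch.

```python
from itertools import combinations


def clustering_summary(edges):
    """Degrees, per-node triangle counts, T, A and the global coefficient."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # loops are ignored in this sketch
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    degree = {u: len(nb) for u, nb in adj.items()}
    # Triangles through u: pairs of neighbours of u that are themselves linked.
    triangles = {
        u: sum(1 for a, b in combinations(nb, 2) if b in adj[a])
        for u, nb in adj.items()
    }
    n_triangles = sum(triangles.values()) // 3  # each triangle seen 3 times
    n_adjacent = sum(d * (d - 1) // 2 for d in degree.values())
    c_global = 3 * n_triangles / n_adjacent if n_adjacent >= 1 else 0.0
    return degree, triangles, n_triangles, n_adjacent, c_global


def local_clustering(degree, triangles, j):
    """Proportion of connected neighbour pairs among nodes of degree j."""
    nodes = [u for u, d in degree.items() if d == j]
    if j < 2 or not nodes:
        return 0.0
    pairs = j * (j - 1) / 2
    return sum(triangles[u] for u in nodes) / (pairs * len(nodes))


# A triangle {0, 1, 2} with one pendant node attached to node 2.
deg, tri, T, A, c_g = clustering_summary([(0, 1), (1, 2), (0, 2), (2, 3)])
print(T, A, c_g, local_clustering(deg, tri, 2), local_clustering(deg, tri, 3))
```

In this toy graph the global and local coefficients already differ, which is the generic situation described later for graphex processes.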
2.2. Assumptions
We will make use of the following three assumptions. Assumption 1 characterises the behaviour of the small-degree nodes. Assumption 2 is a technical assumption to obtain the almost sure results. Assumption 3 characterises the behaviour of large-degree nodes.
A central quantity of interest in the analysis of the asymptotic properties of graphex processes is the marginal generalised graphon function $\mu\,:\,(0,\infty)\to\mathbb R_+$ , defined for $x>0$ by
$$\mu(x)=\int_0^\infty W(x,y)\,dy.$$
The integrability of the generalised graphon W implies that $\mu$ is integrable. Ignoring loops (self-edges), the expected number of connections of a node with parameter $\vartheta$ is proportional to $\mu(\vartheta)$ . Therefore, assuming $\mu$ is monotone decreasing, its behaviour at infinity controls the small-degree nodes, while its behaviour at zero controls the large-degree nodes.
For mathematical convenience, it will be easier to work with the generalised inverse $\mu^{-1}$ of $\mu$ . The behaviour at zero of $\mu^{-1}$ then controls the small-degree nodes, while the behaviour of $\mu^{-1}$ at infinity controls large-degree nodes.
The following assumption characterises the behaviour of $\mu$ at infinity or, equivalently, of $\mu^{-1}$ at zero. We require $\mu^{-1}$ to behave approximately as a power function $x^{-\sigma}$ around zero, for some $\sigma\in[0,1]$ . This behaviour, known as regular variation, has been extensively studied (see, e.g., [Reference Bingham, Goldie and Teugels6]) and we provide some background on it in Appendix C.
Assumption 1. Assume $\mu$ is non-increasing, with generalised inverse $\mu^{-1}(x)=\inf\{ y>0 \mid \mu(y)\leq x\}$ , such that
$$\mu^{-1}(x)\sim \ell(1/x)\,x^{-\sigma}\quad\text{as }x\rightarrow 0,$$
where $\sigma\in\lbrack0,1]$ and $\ell$ is a slowly varying function at infinity: for all $c>0$ , $\lim_{t\rightarrow\infty}\ell(ct)/\ell(t)=1.$
Examples of slowly varying functions $\ell$ include functions converging to a strictly positive constant, and powers of logarithms. Note that Assumption 1 implies that, for $\sigma\in(0,1)$ , $\mu(t)\sim \overline \ell(t)t^{-1/\sigma}\text{ as }t\rightarrow\infty$ for some slowly varying function $\overline \ell$ . We can differentiate four cases, as will be formally derived in Corollary 1:
(i) Dense case: $\sigma=0$ and $\lim_{t\rightarrow\infty}\ell(t)<\infty$ . In this case, $\lim_{x\rightarrow 0}\mu^{-1}(x)<\infty$ , and hence $\mu$ has bounded support. The other three cases are all sparse cases.
(ii) Almost dense case: $\sigma=0$ and $\lim_{t\rightarrow\infty}\ell(t)=\infty$ . In this case $\mu$ has full support and super-polynomially decaying tails.
(iii) Sparse case with power law: $\sigma\in(0,1)$ . In this case $\mu$ has full support and polynomially decaying tails (up to a slowly varying function).
(iv) Very sparse case: $\sigma=1$ . In this case $\mu$ has full support and very light tails. In order for $\mu^{-1}$ (and hence W) to be integrable, we need $\ell$ to go to zero sufficiently fast.
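To make the regular variation condition concrete, the following sketch evaluates the generalised inverse $\mu^{-1}(x)=\inf\{y>0\mid\mu(y)\leq x\}$ numerically for a hypothetical marginal $\mu(y)=(1+y)^{-1/\sigma}$ with $\sigma=1/2$, which falls in the sparse, power-law case (iii). For this choice $\mu^{-1}(x)=x^{-\sigma}-1$ in closed form, so the ratio $\mu^{-1}(x)/x^{-\sigma}$ equals $1-x^{\sigma}$ and tends to 1, with $\ell$ converging to the constant 1. The marginal and the bisection helper are illustrative assumptions.

```python
def generalised_inverse(mu, x, hi=1e12, iters=200):
    """Approximate inf{y > 0 : mu(y) <= x} for a non-increasing mu by bisection."""
    lo = 0.0
    if mu(hi) > x:
        raise ValueError("increase hi: mu(hi) is still above x")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mu(mid) <= x:
            hi = mid  # mid satisfies the constraint, so the infimum is <= mid
        else:
            lo = mid
    return hi


sigma = 0.5


def mu(y):
    """Hypothetical marginal with polynomially decaying tail."""
    return (1.0 + y) ** (-1.0 / sigma)


for x in (1e-2, 1e-4, 1e-6):
    ratio = generalised_inverse(mu, x) / x ** (-sigma)
    print(x, ratio)  # closed form for this mu: 1 - x**sigma
```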
Now define, for $x,y>0$ ,
$$\nu(x,y)=\int_0^\infty W(x,z)W(y,z)\,dz. \tag{16}$$
The expected number of common neighbours of nodes with parameters $(\vartheta_1,\vartheta_2)$ is proportional to $\nu(\vartheta_1,\vartheta_2)$ .
The following assumption is a technical assumption needed in order to obtain the almost sure results on the number of nodes and degrees. The paper [Reference Veitch and Roy38] made a similar assumption to obtain results in probability; see the discussion section for further details.
Assumption 2. Assume that there exist $C_1,a >0$ and $x_0\geq 0$ such that for all $x,y>x_0$ ,
Remark 1. Assumption 2 is trivially satisfied when the function W is separable, $W(x,y)= \mu(x)\mu(y)/\overline W.$ Assumptions 1 and 2 are also satisfied if
for some positive, non-increasing, measurable function f with $\overline f=\int_0^\infty f(x)dx<\infty$ and generalised inverse $f^{-1}$ satisfying $f^{-1}(x)\sim\ell(1/x)x^{-\sigma}\text{ as }x$ tends to 0. In this case, $\mu$ is monotone non-increasing. We have
as x tends to 0 by dominated convergence. Hence $f\{\mu^{-1}(x)\}\sim x$ as x tends to 0 and $f^{-1}[f\{\mu^{-1}(x)\}]\sim \ell(1/x)x^{-\sigma}$ . Assumption 2 follows from the inequality $W(x,y)\leq f(x)f(y)/\overline f$ . Other examples are considered in Section 6.
The following assumption is used to characterise the asymptotic behaviour of both small- and large-degree nodes.
Assumption 3. Assume $\mu^{-1}(t)=\int_t^\infty f(x)dx$ where f is continuous on $(0, \infty)$ and the following hold:
where $\tau>0$ , $\tilde{\sigma} \leq 1$ , and $\ell_{2}, \tilde{\ell}_2$ are slowly varying functions.
Note that Assumption 3 implies that $\mu^{-1}(x)\sim x^{-\tau}\ell_2(x)\text{ as }x\to\infty$ , and $\mu(t)\sim \overline \ell_2(t)t^{-1/\tau}$ as $t\rightarrow 0$ , for some slowly varying function $\overline \ell_2$ . Assumption 3 also implies Assumption 1 with $\sigma = \max( \tilde \sigma , 0)$ , $\ell(x)=\frac{1}{\sigma} \tilde{\ell}_2(x)$ if $\tilde \sigma\neq 0$ , and $\ell(x)=o(\tilde{\ell}_2(x))$ if $\tilde \sigma= 0$ .
Finally, we state an assumption on $\nu(x, y)$, the quantity proportional to the expected number of common neighbours of two nodes with parameters x and y, defined in Equation (16). This technical assumption is used to prove a result on the asymptotic behaviour of the variance of the number of nodes (Proposition 3) and the central limit theorem for sparse graphs stated in Section 4.3.
Assumption 4. Assume that there exist $0<C_{0}\leq C_1$ and $x_{0}\geq0$ such that for all $x,y>x_{0}$ ,
Assumption 4 holds when W is separable, as well as in the model of [Reference Caron and Fox12] under some moment conditions (see Section 6.5). Obviously, Assumption 4 implies that Assumption 2 is satisfied with $a=1$ .
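As a worked check of the separable case, the following short computation takes $\nu(x,y)=\int_0^\infty W(x,z)W(y,z)\,dz$, the form matching the common-neighbour interpretation, and uses $\overline W=\int_0^\infty\mu(x)\,dx$. Under these assumptions, and with $W(x,y)=\mu(x)\mu(y)/\overline W$, the function $\nu$ is exactly proportional to $\mu(x)\mu(y)$:

```latex
\begin{align*}
\nu(x,y)
  &= \int_0^\infty \frac{\mu(x)\mu(z)}{\overline W}\,
                   \frac{\mu(y)\mu(z)}{\overline W}\,dz
   = \frac{\mu(x)\mu(y)}{\overline W^{2}} \int_0^\infty \mu(z)^{2}\,dz ,\\
\int_0^\infty \mu(z)^{2}\,dz
  &\le \Big(\sup_{z}\mu(z)\Big)\int_0^\infty \mu(z)\,dz
   \le \overline W^{1/2}\,\overline W = \overline W^{3/2}.
\end{align*}
```

The second line uses $W\leq 1$, which gives $\mu(z)^2=\overline W\,W(z,z)\leq\overline W$. The proportionality constant is therefore finite and positive, so upper and lower bounds of product form in $\mu(x)\mu(y)$ hold with the same constant.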
3. Asymptotic behaviour of various statistics of the graph
3.1. Asymptotic behaviour of the number of edges, number of nodes, and degree distribution
In this section we characterise the almost sure and expected behaviour of the number of nodes $N_\alpha$ , number of edges ${N_{\alpha}^{(e)}}$ , and number of nodes with j edges $N_{\alpha,j}$ . These results allow us to provide precise statements about the sparsity of the graph and the asymptotic power-law properties of its degree distribution.
We first recall existing results on the asymptotic growth of the number of edges. The growth of the mean number of edges has been shown by [Reference Veitch and Roy38], and the almost sure convergence follows from [Reference Borgs, Chayes, Cohn and Holden9, Proposition 56].
Proposition 1. (Number of edges [Reference Borgs, Chayes, Cohn and Holden9, Reference Veitch and Roy38].) As $\alpha $ goes to infinity, almost surely
$$N_\alpha^{(e)}\sim E\big(N_\alpha^{(e)}\big)\sim \frac{\overline W}{2}\,\alpha^{2}.$$
The following two theorems provide a description of the asymptotic behaviour of the terms $N_\alpha, N_{\alpha,j}$ in expectation and almost surely.
Theorem 1. For $\sigma\in[0,1]$ , let $\ell_\sigma$ be slowly varying functions defined as
Under Assumption 1, for all $\sigma \in [0,1]$ ,
If $\sigma = 0$ then for $j\geq 1$
If $\sigma\in(0,1)$ then for $j\geq 1$
Finally, if $\sigma= 1$ , then
Theorem 1 follows rather directly from asymptotic properties of regularly varying functions [Reference Gnedin, Hansen and Pitman17], recalled in Lemmas B.2 and B.3 in the appendix. Details of the proof are given in Appendix A.1. Note that $\ell(\alpha)=o(\ell_1(\alpha))$ ; hence, for $\sigma=1$ , $E\big(N_{\alpha,j}\big)=o\{E(N_{\alpha,1})\}$ for all $j\geq 2$ .
The paper [Reference Veitch and Roy38] shows that, under Assumption 2 with $a=1$ , we have, in probability,
The next theorem shows that the asymptotic equivalence holds almost surely under Assumptions 1 and 2. Additionally, combining these results with Theorem 1 allows us to characterise the almost sure asymptotic behaviour of the number of nodes and the number of nodes of a given degree. The proof of Theorem 2 is given in Section 3.2.
Theorem 2. Under Assumptions 1 and 2, we have almost surely as $\alpha$ tends to infinity
Combining this with Theorem 1, we obtain that, for all $\sigma\in[0,1]$ ,
Moreover, for $j\geq 1$ , if $\sigma = 0$ then $N_{\alpha,j}=o\{\alpha\ell(\alpha)\} $ , while if $0< \sigma < 1$ then
If $\sigma = 1$ , then $N_{\alpha,1}\sim \alpha^{2}\ell_1(\alpha)$ and for all $j \geq 2$ we also have $N_{\alpha,j}=o\{\alpha^{2}\ell_1(\alpha)\}.$
The following result is a corollary of Theorem 2 which shows how the parameter $\sigma$ relates to the sparsity and power-law properties of the graphs. We denote by $\ell^\#$ the de Bruijn conjugate (see Definition C.2 in the appendix) of the slowly varying function $\ell$ .
Corollary 1. (Sparsity and power-law degree distribution.) Assume Assumptions 1 and 2. For $\sigma\in[0,1]$ , almost surely as $\alpha$ tends to infinity,
The function $\ell_\sigma^*(y)$ is slowly varying and the graph is dense if $\sigma=0$ and $\lim_{t\rightarrow\infty} \ell(t)=C<\infty$ , as ${N_{\alpha}^{(e)}}/N_{\alpha}^2\rightarrow C^2\overline W/2$ almost surely. Otherwise, if $\sigma>0$ or $\sigma=0$ and $\lim_t \ell(t)=\infty$ , the graph is sparse, as ${N_{\alpha}^{(e)}}/N_{\alpha}^2\rightarrow 0$ . Additionally, for $\sigma\in [0,1)$ , for any $j=1,2,\ldots$ ,
almost surely. If $\sigma>0$ , this corresponds to a degree distribution with power-law behaviour, as, for j large,
For $\sigma=1$ , $N_{\alpha,1}/N_\alpha\rightarrow 1$ and $N_{\alpha,j}/N_\alpha\rightarrow 0$ for $j \geq 2$ ; hence the nodes of degree 1 dominate in the graph.
Remark 2. If $\sigma=0$ and $\lim_{t\rightarrow\infty} \ell(t)=\infty$ , the graph is almost dense; that is, ${N_{\alpha}^{(e)}}/N_\alpha^{2}\rightarrow 0$ and ${N_{\alpha}^{(e)}}/N_\alpha^{2-\epsilon}\rightarrow \infty$ for any $\epsilon>0$ . If $\sigma=1$ , the graph is almost extremely sparse [Reference Bollobás and Riordan8], as ${N_{\alpha}^{(e)}}/N_\alpha\rightarrow \infty\text{ and }{N_{\alpha}^{(e)}}/N_\alpha^{1+\epsilon}\rightarrow 0$ for any $\epsilon>0$ .
The above results are important in terms of modelling aspects, since they allow a precise description of the degrees and number of edges as a function of the number of nodes. They can also be used to conduct inference on the parameters of the statistical network model, since the behaviour of most estimators will depend heavily on the behaviour of $N_\alpha$ , $N_{\alpha}^{(e)} $ , and possibly $N_{\alpha,j}$ . For instance the naive estimator of $\sigma$ given by
is almost surely consistent. Note that, following an earlier version of the present paper, [Reference Naulet, Sharma, Veitch and Roy29] proposed an alternative estimator for $\sigma$ , with better statistical properties. Indeed, under Assumptions 1 and 2, using Theorems 1 and 2, we have almost surely $N_\alpha^2\sim\alpha^{2+2\sigma}\ell_\sigma(\alpha)^2$ and $N_\alpha^{(e)}\sim \alpha^2\overline W/2$ . Hence
and the result follows as $\log\ell_\sigma(\alpha)/\log\alpha\rightarrow 0$ .
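As a numerical illustration of the consistency argument, the sketch below assumes that the naive estimator takes the plug-in form $2\log N_\alpha/\log N_\alpha^{(e)}-1$, which is one form consistent with the asymptotics $N_\alpha^2\sim\alpha^{2+2\sigma}\ell_\sigma(\alpha)^2$ and $N_\alpha^{(e)}\sim\alpha^2\overline W/2$ used above. The function name and the idealised inputs (slowly varying factor set to 1, $\overline W=1$) are assumptions made for illustration.

```python
import math


def naive_sigma_estimate(n_nodes, n_edges):
    """Hypothetical plug-in estimator: 2 log(N) / log(N_e) - 1."""
    if n_nodes < 2 or n_edges < 2:
        raise ValueError("need at least two nodes and two edges")
    return 2.0 * math.log(n_nodes) / math.log(n_edges) - 1.0


sigma, w_bar = 0.5, 1.0
for alpha in (1e3, 1e6, 1e12):
    n_nodes = alpha ** (1 + sigma)     # idealised: slowly varying factor = 1
    n_edges = alpha ** 2 * w_bar / 2   # leading-order number of edges
    print(alpha, naive_sigma_estimate(n_nodes, n_edges))
```

The printed values approach $\sigma=0.5$ only at a logarithmic rate, because the constant $\overline W/2$ and the slowly varying factor enter through terms of order $1/\log\alpha$. This is in line with the remark that estimators with better statistical properties have been proposed.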
All of the above results concern the behaviour of small-degree nodes, where the degree j is fixed as the size of the graph goes to infinity. It is also of interest to look at the number of nodes of degree j as both $\alpha$ and j tend to $\infty$ . We show in the next proposition that this is controlled by the behaviour of the function f, introduced in Assumption 3, at 0 or $\infty$ .
Proposition 2. (Power law for high-degree nodes.) Assume that Assumption 3 holds. Then when $j \rightarrow \infty$ , $\log \alpha = o(j)$ , and $j/\alpha \rightarrow c_0 \in [0, \infty]$ , we have
Note that Proposition 2 implies that when $j/\alpha\to\infty$ ,
which corresponds to power-law behaviour with exponent $1+\tau$ . If $j/\alpha\to 0$ then
This is similar to the asymptotic results for j fixed, stated in Theorem 1, noting that
as $j\to\infty$ . Finally, if $j/\alpha \rightarrow c_0 \in (0,\infty)$ , then $ E\big(N_{\alpha,j}\big) \sim f(c_0) \in (0,\infty)$ .
Proof. Under Assumption 3, we have $\mu^{-1}(t)=\int_t^\infty f(x)dx$ with
From [Reference Veitch and Roy38, Theorem 5.5] we have, assuming that $W(x,x)=0$ for the sake of simplicity,
where $X_j$ is a gamma random variable with shape $j+1$ and inverse scale $j+1$. We split the above expectation into $X_j < 1/2$, $X_j \in [1/2, 3/2]$, and $X_j>3/2$. The idea is that the first and third expectations are small because $X_j$ concentrates rapidly around 1, while the middle expectation ($X_j \in [1/2, 3/2]$) uses the fact that $f((j+1)X_j/\alpha) \approx f((j+1)/\alpha)$. More precisely, using Stirling’s approximation, for every $\epsilon>0$ there exists $c>0$ such that
since $\alpha/j = o(e^{cj }) $ for any $c>0$ . The expectation over $X_j >3/2$ is treated similarly. We now study the expectation over $[1/2, 3/2]$ . We have that if $j/\alpha \rightarrow \infty $ , then uniformly in $x \in [1/2, 3/2]$ , under Assumption 3,
and similarly when $j/\alpha \rightarrow 0$ , with $\tau $ replaced by $\tilde \sigma$ ; if $j/\alpha \rightarrow c_0 \in (0,\infty)$ , then uniformly in $x \in [1/2, 3/2]$ ,
Moreover, since $X_j $ converges almost surely to 1, we finally obtain that
which completes the proof.
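The concentration step used in the proof can be checked empirically. The sketch below assumes that $X_j$ is gamma distributed with shape $j+1$ and inverse scale $j+1$, so that it has mean 1 and variance $1/(j+1)$, and estimates the probability that $X_j$ falls in $[1/2,3/2]$. The helper name and sample sizes are illustrative.

```python
import random


def fraction_in_window(j, n_samples, rng, lo=0.5, hi=1.5):
    """Monte Carlo estimate of P(lo <= X_j <= hi) for X_j ~ Gamma(j+1, rate j+1)."""
    shape = j + 1
    scale = 1.0 / (j + 1)  # random.gammavariate takes (shape, scale)
    draws = (rng.gammavariate(shape, scale) for _ in range(n_samples))
    return sum(lo <= x <= hi for x in draws) / n_samples


rng = random.Random(0)
for j in (1, 10, 100, 1000):
    print(j, fraction_in_window(j, 5000, rng))
```

The fraction rises quickly towards 1 as $j$ grows, which is the behaviour that makes the contributions of $X_j<1/2$ and $X_j>3/2$ negligible.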
3.2. Proof of Theorem 2
The proof is similar to that of [Reference Veitch and Roy38, Theorem 6.1] and proceeds by bounding the variance. The paper [Reference Veitch and Roy38] showed that $\textrm{var}(N_{\alpha})=o(E(N_{\alpha})^2)$ and $\textrm{var}\big(N_{\alpha,j}\big)=o(E\big(N_{\alpha,j}\big)^2)$ and used this result to prove that (22) holds in probability; we need a slightly tighter bound on the variances to obtain the almost sure convergence. This is stated in the next two propositions.
Proposition 3. Let $N_\alpha$ be the number of nodes. We have
Under Assumptions 1 and 2, with $\sigma\in[0,1]$ , slowly varying function $\ell$ , and positive scalar a satisfying (17), we have
where the slowly varying functions $\ell_\sigma$ are defined in Equation (20). Additionally, under Assumptions 1 and 4, we have, for any $\sigma\in[0,1]$ and any slowly varying function $\ell$ ,
Sketch of the proof. We give here the ideas behind the proof, deferring its completion to Section A.1 of the supplementary material [Reference Caron, Panero and Rousseau13]. Equation (25) is immediately obtained using the Slivnyak–Mecke and Campbell theorems. Applying the inequality $e^{x}-1\le xe^x$ and Lemmas B.2 and B.6 to the right-hand side of Equation (25), we obtain the upper bound of Equation (26). Finally, if Assumption 4 holds, then Assumption 2 holds as well with $a=1$. Combining this with Assumption 1, we can therefore specialise the upper bound of Equation (26) to the case $a=1$, which gives $O(\alpha^{1+2\sigma}\ell^2_\sigma(\alpha))$. The lower bound of the same order is found using the inequality $e^x-1\ge x$ and Lemmas B.2 and B.3.
Proposition 3 and Theorem 1 imply in particular that, under Assumptions 1 and 2,
for some $\kappa>0$ . Here $N_\alpha$ is a positive, monotone increasing stochastic process. Using Lemma B.1 in the appendix, we obtain that $N_\alpha\sim E(N_\alpha)$ almost surely as $\alpha$ tends to $\infty$ .
Proposition 4. Let $N_{\alpha,j}$ be the number of nodes of degree j. Then, under Assumptions 1 and 2, with $\sigma\in[0,1]$ , slowly varying function $\ell$ , and positive scalar a satisfying (17), we have
where the slowly varying functions $\ell_\sigma$ are defined in Equation (20). In the case $\sigma=0$ and $a=1$ , we have the stronger result
Sketch of the proof. While the complete proof of Proposition 4 is given in Section A.2 of the supplementary material [Reference Caron, Panero and Rousseau13], we outline its main steps here. We start by evaluating the expectations of $N_{\alpha, j}^2$ and $N_{\alpha, j}$ conditional on the unit-rate Poisson random measure $M=\sum_{i}\delta_{(\theta_{i},\vartheta_{i})}$:
where $b=(b_{11},b_{12},b_{22})\in\{0,1\}^3$ . We then use the Slivnyak–Mecke theorem to obtain $E(N_{\alpha,j}^{2})-E\big(N_{\alpha,j}\big)$ , which can be bounded by a sum of terms of the form
for $k_1,k_2,r\in\{0,\ldots,j\}$. For terms with $r\geq1$, we use Lemma A.1 (stated and proved, using Lemmas B.2 and B.4, in Section A.2 of the supplementary material). The lemma states that, under Assumptions 1 and 2, the integral in (28) is in $O\big( \alpha^{r-2ar+2\sigma } \ell_\sigma^2(\alpha)\big)=O\big( \alpha^{1-2a+2\sigma } \ell_\sigma^2(\alpha)\big)$ for any $r\ge 1$, $k_1,k_2\ge 0$. For terms with $r=0$ in (28), we use the inequality $e^{x}\leq 1+xe^x$, the Cauchy–Schwarz inequality, and Lemma B.4 to show that these terms are in $O\big\{\alpha^{3+2\sigma-2a}\ell_\sigma^2(\alpha)\big\}$, which completes the proof.
Define $\widetilde{N}_{\alpha,j} =\sum_{k\geq j}N_{\alpha,k}$ , the number of nodes of degree at least j. Note that $\widetilde{N}_{\alpha,j}$ is a positive, monotone increasing stochastic process in $\alpha$ , with $ \widetilde{N}_{\alpha,j} = N_\alpha - \sum_{k = 1}^{j-1} N_{\alpha,k}$ . We then have, using the Cauchy–Schwarz and Jensen inequalities, that
Consider first the case $\sigma\in[0,1)$ . Since Theorem 1 implies, for $j\geq 2$ , $\alpha^{1+\sigma}\ell(\alpha) \lesssim E\big(\widetilde{N}_{\alpha,j}\big)$ as $\alpha$ goes to infinity, using Propositions 3 and 4, we obtain $\textrm{var}\big(\widetilde N_{\alpha,j}\big) =O\{\alpha^{-\tau} E\big(\widetilde N_{\alpha,j}\big)^2 \}$ for some $\tau>0$ . Combining this with Lemma B.1 leads to $\widetilde N_{\alpha,j}\sim E\big(\widetilde N_{\alpha,j}\big)$ almost surely as $\alpha$ goes to infinity.
The almost sure results for $N_{\alpha,j}$ then follow from the fact that, for all $j\geq 2$ , $E\big(\widetilde N_{\alpha,j}\big)\asymp E(N_\alpha)$ if $\sigma\in(0,1)$ , $E\big(\widetilde N_{\alpha,j}\big)\sim E(N_\alpha)$ if $\sigma=0$ , and $E\big(\widetilde N_{\alpha,j}\big)=o\{E(N_\alpha)\}$ if $\sigma=1$ .
3.3. Asymptotic behaviour of the clustering coefficients
The following proposition is a direct corollary of [Reference Borgs, Chayes, Cohn and Holden9, Proposition 56], which showed the almost sure convergence of subgraph counts in graphex processes.
Proposition 5. (Global clustering coefficient [Reference Borgs, Chayes, Cohn and Holden9].) Assume $\int_{0}^{\infty}\mu(x)^{2}dx<\infty$ . Recall that $T_\alpha$ and $A_\alpha$ are respectively the number of triangles and the number of adjacent edges in the graph of size $\alpha$ . We have
almost surely as $\alpha\rightarrow\infty$ . Therefore, if $\int_{0}^{\infty}\mu(x)^{2}dx>0$ , the global clustering coefficient defined in Equation (11) converges to a constant
Note that if $\mu$ is monotone decreasing, as $\overline{W}<\infty$ , we necessarily have $\int_{a}^{\infty}\mu(x)^{2}dx<\infty$ for any $a>0$ . Hence the condition $\int_{0}^{\infty}\mu(x)^{2}dx<\infty$ in Proposition 5 requires additional assumptions on the behaviour of $\mu$ at 0 (or equivalently the behaviour of $\mu^{-1}$ at $\infty$ ), which drives the behaviour of large-degree nodes. If the graph is dense, $\mu$ is bounded and thus $\int_{0}^{\infty}\mu(x)^{2}dx<\infty$ .
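The finiteness of the tail integral follows from a one-line bound, sketched here under the assumptions that $\mu$ is non-increasing and that $\overline W=\int_0^\infty\mu(x)\,dx$:

```latex
\[
\int_a^\infty \mu(x)^{2}\,dx
  \;\le\; \mu(a)\int_a^\infty \mu(x)\,dx
  \;\le\; \mu(a)\,\overline W \;<\; \infty ,
\qquad\text{where}\qquad
\mu(a) \;\le\; \frac{1}{a}\int_0^a \mu(x)\,dx \;\le\; \frac{\overline W}{a}.
\]
```

The bound degenerates as $a\rightarrow 0$, which is why integrability of $\mu^2$ near zero is a genuine additional requirement.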
Proposition 6. (Local clustering coefficient.) Assume Assumptions 1 and 2 hold with $\sigma\in(0,1)$ . Assume additionally that
for some $b\in[0,1]$. Then the local clustering coefficients converge in probability as $\alpha\rightarrow\infty$:
If $b>0$ , the above result holds almost surely, and the average local clustering coefficient satisfies
In general,
and the global clustering and local clustering coefficients converge to different limits. A notable exception is the separable case where $W(x,y)=\mu(x)\mu(y)/\overline{W}$ , since in this case
and
Sketch of the proof. Full details are given in Appendix A.2; here we give only a sketch of the proof, which is similar to that of Theorem 2. We have
Here $R_{\alpha,j}$ corresponds to the number of triangles having a node of degree j as a vertex; triangles having $k\leq 3$ degree-j nodes as vertices are counted k times.
We obtain an asymptotic expression for $E\big(R_{\alpha,j}\big)$ , and show that $\textrm{var}\big( R_{\alpha,j} \big) = O\big(\alpha^{1 -2a} \big[E\big(R_{\alpha,j}\big)\big]^2 \big)$ . We then prove that $R_{\alpha,j}/E\big(R_{\alpha,j}\big) $ goes to 1 almost surely. The latter is obtained by proving that $ R_{\alpha,j}$ is nearly monotone increasing by constructing an increasing sequence $\alpha_n $ going to infinity such that $E\big(R_{\alpha_n,j}\big)/E\big(R_{\alpha_{n+1},j}\big)$ goes to 1 and such that for all $\alpha \in (\alpha_n, \alpha_{n+1})$ ,
Roughly speaking, $\tilde R_{n,j}$ (defined in Equation (60)) corresponds to the sum of the number of triangles $T_{\alpha_{n+1}, i}$ over the set of nodes i such that $D_{n,i}\leq j$ and i has at least one connection with some ${i^{\prime}}$ such that $\theta_{i^{\prime}} \in (\alpha_n, \alpha_{n+1})$. The result for the local clustering coefficient then follows from Toeplitz’s lemma (see e.g. [Reference Loève27, p. 250]).
4. Central limit theorems
We now present central limit theorems (CLTs) for subgraph counts (numbers of edges, triangles, etc.) and for the number of nodes $N_\alpha$. Subgraph counts can be expressed as U-statistics of Poisson random measures (up to an asymptotically negligible term). A CLT then follows rather directly from the CLT for U-statistics of Poisson random measures [Reference Reitzner and Schulte35].
Obtaining a CLT for quantities like $N_\alpha$ is more challenging, since these cannot be reduced to U-statistics. In this section we prove the CLT for $N_\alpha$ ; we separate the dense and sparse cases because the techniques of the respective proofs are very different. The proof of the sparse case requires additional assumptions and is much more involved. We believe that the same technique of proof can be used for other quantities of interest, such as the number $N_{\alpha,j}$ of nodes of degree j, with more tedious computations.
4.1. CLT for subgraph counts
4.1.1. Statement of the result
Let F be a given subgraph which has neither isolated vertices nor loops. Denote by $|F|$ the number of nodes, $\{1, \cdots, |F|\}$ the set of vertices, and e(F) the set of edges. Let $N^{(F)}_{\alpha}$ be the number of subgraphs F in the graph $\mathcal{G}_\alpha$ :
where $k_{(F)}$ is a constant accounting for the multiple counts of F, which we can omit in the rest of the discussion since it does not depend on $\alpha$ . Note that this statistic covers the number of edges (excluding loops) if $|F|=2$ and the number of triangles if $|F|=3$ and $e(F)=\{(1,2),(1,3),(2,3)\}$ . It is known in the graph literature as the number of injective adjacency maps from the vertex set of F to the vertex set of $\mathcal{G}_\alpha$ ; see [Reference Borgs, Chayes, Cohn and Holden9, Section 2.5].
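To make the notion of injective adjacency maps concrete, the following minimal sketch counts them by brute force on a small graph. The toy graph, the function name, and the 0-based vertex labels are illustrative assumptions, not part of the paper's construction; the normalisation of $k_{(F)}$ is not assumed here.

```python
from itertools import permutations


def injective_adjacency_maps(pattern_edges, k, graph_edges, n):
    """Count injective maps from pattern vertices {0..k-1} to graph
    vertices {0..n-1} that send every pattern edge to a graph edge."""
    adjacency = {frozenset(e) for e in graph_edges}
    count = 0
    # permutations(range(n), k) enumerates all injective maps {0..k-1} -> {0..n-1}
    for image in permutations(range(n), k):
        if all(frozenset((image[a], image[b])) in adjacency
               for a, b in pattern_edges):
            count += 1
    return count


# Hypothetical toy graph: the 4-cycle 0-1-2-3 plus the chord 0-2
# (five edges, two triangles).
graph = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
edge_pattern = [(0, 1)]                      # |F| = 2
triangle_pattern = [(0, 1), (0, 2), (1, 2)]  # |F| = 3

edge_maps = injective_adjacency_maps(edge_pattern, 2, graph, 4)
triangle_maps = injective_adjacency_maps(triangle_pattern, 3, graph, 4)
```

In this toy case each edge is counted twice and each triangle six times, which is the kind of multiple counting that a constant independent of $\alpha$ can absorb.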
Proposition 7. Let F be a subgraph without loops or isolated vertices. Assume that $\int_0^\infty \mu(x)^{2|F|-2}dx<\infty$ . Then
as $\alpha$ goes to infinity, where
and
for some positive constant $c_F$ that depends only on F.
Remark 3. If the graph is dense, then $\mu$ is a bounded function with bounded support and therefore $\int_0^\infty \mu(x)^{p}dx<\infty$ for any p. In the sparse case, if $\mu$ is monotone, we necessarily have $\int_a^\infty \mu(x)^p dx<\infty$ for any $p>1$ . The condition $\int_0^\infty \mu(x)^{2|F|-2}dx<\infty$ therefore requires additional assumptions on the behaviour of $\mu$ at 0, which drives the behaviour of large-degree nodes.
4.1.2. Proof
Recall that $M=\sum_{i}\delta_{(\theta_{i},\vartheta_{i})}$ . The main idea of the proof is to use the decomposition
and to show that $E\big(N^{(F)}_{\alpha}|M\big)$ is a geometric U-statistic of a Poisson process, for which a CLT has been derived by [Reference Reitzner and Schulte35].
In this section, denote by $K=|F|\geq 2$ the number of nodes of the subgraph F. The subgraph counts are
where $\mathbb S_K$ denotes the set of permutations of $\{1,\ldots, K\}$ .
Using the extended Slivnyak–Mecke theorem, we have
As $\int_0^\infty \mu(x)^{K-1}dx<\infty$ , [Reference Borgs, Chayes, Cohn and Holden9, Lemma 62] implies that $E\big(N^{(F)}_{\alpha}\big)<\infty$ . For any $K\geq 2$ , define the symmetric function
using the condition (3) and the fact that $\int_0^\infty \mu(x)^{K-1}dx<\infty$ , it satisfies $0 <\int_{\mathbb R_+^{K}} f(x_1,\ldots,x_K)dx_1\ldots dx_K<\infty.$
We state the following useful lemma.
Lemma 1. The function f satisfies, for all $x_K\geq 0$ ,
for some constant $C_0$ .
Proof. Let $\pi\in\mathbb S_K$ and $r_K\in \{1,\ldots,K\}$ be such that $\pi_{r_K}=K$ . Denote by $S\subseteq \{1,\ldots,K-1\}$ the set of indices i such that $(i,r_K)\in e(F)$ and i has no other connections in F. Then
for some constant $C_1$ .
It follows from Lemma 1 and from the fact that $\int_0^\infty\mu(x)dx<\infty$ that, if $\int_0^\infty \mu(x)^{2K-2}dx<\infty$ , then
We are now ready to derive the asymptotic expression for the variance of $N^{(F)}_\alpha$ . Using the extended Slivnyak–Mecke theorem again,
It follows that
as $\alpha$ tends to infinity, where
We now prove the CLT. The first term of the right-hand side of Equation (32) takes the form
By the superposition property of Poisson random measures, we have
where the right-hand side is a geometric U-statistic [Reference Reitzner and Schulte35, Definition 5.1] of the Poisson point process $\big\{\big(\widetilde \theta_i,\widetilde \vartheta_i\big)_{i\geq 1}\big\}$ with mean measure $\alpha d\widetilde \theta d\widetilde\vartheta$ on $[0,1]\times\mathbb R_+$ . Theorem 5.2 in [Reference Reitzner and Schulte35] therefore implies that
where $\textrm{var}\big(E\big(N_{\alpha}^{(F)} \mid M\big)\big)\sim \textrm{var}\big(N_{\alpha}^{(F)}\big)\sim k_{(F)}^2 |F|^2 \alpha^{2|F|-1} \sigma_F^2$ . One can show similarly (proof omitted) that $\textrm{var}\big(N_{\alpha}^{(F)}-E\big(N_{\alpha}^{(F)}\mid M\big)\big)=o\big(\alpha^{2|F|-1}\big)$ . It follows from Equations (32) and (35) and the Chebyshev inequality that
as $\alpha$ tends to infinity.
4.2. CLT for $N_\alpha$ (dense case)
4.2.1. Statement of the result
In the dense case, $\mu$ has bounded support. If it is monotone decreasing, then Assumption 1 is satisfied with $\sigma=0$ , and $\ell(t)=\sup\{x>0\mid\mu(x)>0\}$ is constant. In this case a CLT applies, as described in the following theorem.
Theorem 3. (Dense case.) Assume that Assumption 1 holds with $\sigma=0$ and $\ell(t)= C\in(0,\infty)$ , where $C=\sup\{x>0\mid\mu(x)>0\}$ (dense case). Also assume that Assumption 2 holds with $a=1$ . Then
Moreover, $E(N_\alpha) = \alpha C -m_{\alpha,0}$ , where
The quantity $m_{\alpha,0}$ can be interpreted as the expected number of degree-0 nodes, and is finite in the dense case. As shown in the following examples, $m_{\alpha,0}$ can either diverge or converge to a constant as $\alpha$ tends to infinity.
Example 1. Consider $\mu(x)=\mathbb{1}_{x\in\lbrack0,1]}$ , $\mu(x)=(1-x)^{2}\mathbb{1}_{x\in\lbrack0,1]}$ , and $\mu(x)=(1-x)^{3}\mathbb{1}_{x\in\lbrack0,1]}$ . We respectively have $m_{\alpha,0}\rightarrow0$ , $m_{\alpha,0}\sim\frac{\sqrt{\pi}}{2}\alpha^{1/2}$ , and $m_{\alpha,0}\sim\Gamma(4/3)\alpha^{2/3}$ .
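The two divergent rates in Example 1 can be checked numerically. The sketch below assumes that $m_{\alpha,0}$ behaves like $\alpha\int_0^1 e^{-\alpha\mu(x)}dx$, that is, it drops the diagonal factor $1-W(x,x)$, which is an assumption made only for this illustration; the function name and the midpoint rule are likewise illustrative choices.

```python
import math


def m_alpha0(alpha, p, n_steps=200_000):
    """Midpoint-rule approximation of alpha * int_0^1 exp(-alpha*(1-x)**p) dx,
    i.e. the integral with mu(x) = (1-x)**p on [0,1] (diagonal term dropped)."""
    h = 1.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        u = (i + 0.5) * h  # u = 1 - x, midpoint of the i-th cell
        total += math.exp(-alpha * u ** p)
    return alpha * total * h


alpha = 1.0e4
# Ratios of the numerical value to the rates stated in Example 1.
ratio_square = m_alpha0(alpha, 2) / (math.sqrt(math.pi) / 2 * alpha ** 0.5)
ratio_cube = m_alpha0(alpha, 3) / (math.gamma(4 / 3) * alpha ** (2 / 3))
```

Both ratios should be close to 1 for large $\alpha$, in line with the rates $\frac{\sqrt{\pi}}{2}\alpha^{1/2}$ and $\Gamma(4/3)\alpha^{2/3}$.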
The above CLT for $N_\alpha$ can be generalised to $\widetilde N_{\alpha,j}=\sum_{k\geq j} N_{\alpha,k}$ , the number of nodes of degree at least j.
Theorem 4. Assume that Assumption 1 holds with $\sigma=0$ and $\ell(t)= C\in(0,\infty)$ , where $C=\sup\{x>0\mid\mu(x)>0\}$ (dense case). Also assume that Assumption 2 holds with $a=1$ . Then for any $j\geq 1$ ,
Moreover, $E\big(\widetilde N_{\alpha,1}\big)=E(N_\alpha)=\alpha C-m_{\alpha,0}$ , and for $j\geq 2$ , $E\big(\widetilde N_{\alpha,j}\big) = \alpha C -m_{\alpha,0}-\sum_{k=1}^{j-1}E\big(N_{\alpha,k}\big)$ , where $m_{\alpha,0}$ is defined in Equation (37) and $E\big(N_{\alpha,k}\big)$ is defined in Equation (53). Note that $m_{\alpha,0}=o(\alpha)$ , and for any $j\geq 1$ , $E\big(N_{\alpha,j}\big)=o(\alpha)$ .
4.2.2. Proof
Given a point ( $\theta,\vartheta$ ) such that $\vartheta>C$ , its degree is necessarily equal to zero, as $\mu(\vartheta)=0$ . Write
$Q_{\alpha}$ is the total number of nodes i with $\theta_i\leq \alpha$ that could have a connection (hence such that $\mu(\vartheta_i)>0$ ), and
is the number of nodes i with degree 0, but for which $\theta_i\leq\alpha$ , $\mu(\vartheta_i)>0$ . In the dense regime, both $Q_{\alpha}$ and $N_{\alpha,0}$ are almost surely finite. Furthermore, $(Q_{\alpha})_{\alpha\geq 0}$ is a homogeneous Poisson process with rate C. By the law of large numbers, $Q_{\alpha}\sim\alpha C\sim N_{\alpha}$ almost surely as $\alpha$ tends to infinity. Using Campbell’s theorem, the Slivnyak–Mecke formula, and monotone convergence, we have $E(N_{\alpha,0})=\alpha\int_{0}^{C}(1-W(x,x))e^{-\alpha\mu(x)}dx=o(\alpha).$ We also have that
Hence, using the inequality $e^{x}-1\leq xe^{x}$ , we obtain
Using Lemma B.6 in the appendix and Assumption 2 with $a=1$ , we have
It follows that $\textrm{var}(N_{\alpha,0})=o(\alpha)$ . This implies, by Chebyshev’s inequality, the CLT for Poisson processes, and Slutsky’s theorem, that
This concludes the proof of Theorem 3. The proof of Theorem 4 follows similarly. Note that the case $j=1$ in Theorem 4 corresponds to Theorem 3. For any $j\geq 2$ , $\widetilde N_{\alpha,j}=Q_{\alpha}-N_{\alpha,0}-\sum_{k=1}^{j-1} N_{\alpha,k}.$ We have, using the Cauchy–Schwarz inequality and Proposition 4,
This implies
4.3. CLT for $N_\alpha$ (sparse case)
4.3.1. Statement of the result
We now assume that we are in the sparse regime; that is, $\mu$ has unbounded support. We make the following additional assumption in order to prove the asymptotic normality. This assumption holds when W is separable, as well as in the model of [Reference Caron and Fox12] under some moment conditions (see Section 6.5).
Assumption 5. Assume that for any $j\leq 6$ and any $(x_1,\ldots,x_j)\in\mathbb R_+^j$ ,
where L is a locally integrable, slowly varying function converging to a (strictly positive) constant, such that
We now state the CLT for $N_\alpha$ under the sparse regime. Recall that in this case, when Assumption 1 holds, we have either $\sigma=0$ and $\ell(t)\to\infty$ or $\sigma\in(0,1]$ .
Theorem 5. (Sparse case.) Assume that $\mu$ has unbounded support (sparse regime). Under Assumptions 1, 4, and 5, we have
4.3.2. Proof
The proof uses the recent results of [Reference Last, Peccati and Schulte24] on normal approximations of nonlinear functions of a Poisson random measure. We have the decomposition
where
is a nonlinear functional of the Poisson random measure M, and
is a linear functional of M with $h_{\alpha}(\theta,\vartheta)=\mathbb{1}_{\theta\leq\alpha}\left[ 1-(1-W(\vartheta,\vartheta))e^{-\alpha\mu(\vartheta)}\right] $ . Theorem 5 is a direct consequence of the following three propositions and of Slutsky’s theorem.
Proposition 8. Under Assumptions 1 and 4, we have
hence
Proposition 9. Under Assumptions 1 and 4, we have
hence, if $\mu$ has unbounded support,
The above two propositions are proved in Section B of the supplementary material [Reference Caron, Panero and Rousseau13].
Proposition 10. Assume $\mu$ has unbounded support. Under Assumptions 1, 4, and 5, we have
Sketch of the proof. To prove Proposition 10 we resort to [Reference Last, Peccati and Schulte24, Theorem 1.1] on the normal approximation of nonlinear functionals of Poisson random measures. Define
where $v_{\alpha}= \textrm{var}(f_{\alpha}(M))\sim \textrm{var}(N_{\alpha} )\asymp\alpha^{1+2\sigma}\ell_\sigma^{2}(\alpha)$ . Note that $E(F_\alpha)=0$ and $\textrm{var}(F_\alpha)=1$ . Consider the difference operator $D_{z}F_\alpha$ defined by
and also
Define
In Section B.3 of the supplementary material [Reference Caron, Panero and Rousseau13], we prove that under Assumptions 1, 4, and 5, we have $\gamma_{\alpha,1},\gamma_{\alpha,2},\gamma_{\alpha,3}\rightarrow0$ . The proof is rather lengthy, and makes repeated use of Hölder’s inequality and of properties of integrals involving regularly varying functions (in particular Lemma B.5). An application of [Reference Last, Peccati and Schulte24, Theorem 1.1] then implies that $F_\alpha\to\mathcal N(0,1)$ .
5. Related work and discussion
Veitch and Roy [Reference Veitch and Roy38] proved that Equation (22) holds in probability, under slightly different assumptions: they assume that Assumption 2 holds with $a=1$ and that $\mu$ is differentiable, with some conditions on the derivative, but do not make any assumption on the existence of $\sigma$ or $\ell$ . We note that for all the examples considered in Section 6, Assumptions 1 and 2 are always satisfied, but Assumption 2 does not hold with $a=1$ for the non-separable graphon function (40). Additionally, the differentiability condition does not hold for some standard graphon models, such as the stochastic block-model. Borgs et al. [Reference Borgs, Chayes, Cohn and Holden9] proved, amongst other results, the almost sure convergence of the subgraph counts in graphex models (Theorem 156). For the subclass of graphon models defined by Equation (41), [Reference Caron and Fox12] provided a lower bound on the growth in the number of nodes, and therefore an upper bound on the sparsity rate, using assumptions of regular variation similar to Assumption 1. Applying the results derived in this section, we show in Section 6.5 that the bound is tight, and we derive additional asymptotic properties for this particular class.
As mentioned in the introduction, another class of (non-projective) models that can produce sparse graphs are sparse graphons [Reference Bickel and Chen4, Reference Bickel, Chen and Levina5, Reference Bollobás and Riordan8, Reference Wolfe and Olhede41]. In particular, a number of authors have considered the sparse graphon model in which two nodes i and j in a graph of size n connect with probability $\rho_n W(U_i,U_j)$ , where $W\,:\,[0,1]^2\to [0,1]$ is the graphon function, measurable and symmetric, and $\rho_n\to 0$ . Although such a model can capture sparsity, it has rather different properties from graphex models. For example, the global clustering coefficient for this sparse graphon model converges to 0, while for graphex models it converges to a positive constant, as shown in Proposition 5.
Also, graphex processes include as a special case dense vertex-exchangeable random graphs [Reference Aldous1, Reference Diaconis and Janson15, Reference Hoover19, Reference Lovász and Szegedy28], that is, models based on a graphon on [0, 1]. They also include as a special case the class of graphon models over more general probability spaces [Reference Bollobás, Janson and Riordan7]; see [Reference Borgs, Chayes, Cohn and Holden9, p. 21] for more details. Some other classes of graphs, such as geometric graphs arising from Poisson processes in different spaces [Reference Penrose34], cannot be cast in this framework.
6. Examples of sparse and dense models
We provide here some examples of the four different cases: dense, almost dense, sparse, and almost extremely sparse. We also show that the results of the previous section apply to the particular model studied by [Reference Caron and Fox12].
6.1. Dense graph
Let us consider the graphon function
which has bounded support. The corresponding marginal graphon function $\mu(x)= \mathbb{1}_{x\leq1} (1-x)/2$ has inverse $\mu^{-1}(x)=\ell(1/x)$ , where $\ell(1/x)=(1-2x)\mathbb{1}_{x\leq1/2}$ is slowly varying since $\ell(1/x)\rightarrow 1$ . Assumptions 1 and 2 are satisfied, so by Theorem 2 and Corollary 1,
almost surely as $\alpha\rightarrow\infty$ . The function W is separable and $C_\alpha^{(g)}\to 4/9$ .
6.2. Sparse, almost dense graph without power-law behaviour
Consider the graphon function
considered by [Reference Veitch and Roy38], which has full support. The corresponding function $\mu(x)=e^{-x}$ has inverse $\mu^{-1}(x)=\ell(1/x)=\log\!(1/x)\mathbb{1}_{0<x<1}$ , which is a slowly varying function. We have $\ell_0^*(x)=1/\log\!(x)^2$ . Assumptions 1 and 2 are satisfied, and
The function W is separable, and $C_\alpha^{(g)}\to 1/4$ .
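The limits $4/9$ and $1/4$ quoted in Sections 6.1 and 6.2 can be reproduced numerically. The sketch below assumes that the limit of the global clustering coefficient is the ratio $\int W(x,y)W(y,z)W(x,z)dxdydz/\int\mu(x)^2dx$ (the form of Proposition 5 is not restated in this section, so this is an assumption), which for a separable $W(x,y)=\mu(x)\mu(y)/\overline W$ reduces to $\big(\int\mu^2\big)^2/\overline W^3$. The helper names and the truncation of the integral at 50 are illustrative.

```python
import math


def integrate(f, a, b, n=100_000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def separable_clustering_limit(mu, upper):
    """(int mu^2)^2 / (int mu)^3 for a marginal mu supported on [0, upper]
    (or numerically negligible beyond upper)."""
    w_bar = integrate(mu, 0.0, upper)
    mu_sq = integrate(lambda x: mu(x) ** 2, 0.0, upper)
    return mu_sq ** 2 / w_bar ** 3


# Section 6.1: mu(x) = (1 - x)/2 on [0, 1].
dense_limit = separable_clustering_limit(lambda x: (1.0 - x) / 2.0, 1.0)
# Section 6.2: mu(x) = exp(-x); the tail beyond 50 is negligible.
almost_dense_limit = separable_clustering_limit(lambda x: math.exp(-x), 50.0)
```

Under the stated assumption, the two values agree with $4/9$ and $1/4$ up to quadrature error.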
6.3. Sparse graphs with power-law behaviour
We consider two examples here, one separable and one non-separable. Interestingly, while the degree distributions in the two examples have similar power-law behaviours, the clustering properties are very different. In the first example, the local clustering coefficient converges to a strictly positive constant, while in the second example it converges to 0.
Separable example. First, consider the function
with $\sigma\in(0,1)$ . We have $\mu(x)=\sigma (x+1)^{-1/\sigma}/(1-\sigma)$ , $\mu^{-1}(x)=x^{-\sigma}(1/\sigma-1)^{-\sigma}-1$ , $\ell(t)\sim (1/\sigma-1)^{-\sigma}$ , and $\ell_\sigma^*(t)\sim \left \{(1/\sigma-1)^{-\sigma}\Gamma(1-\sigma) \right \}^{-2/(1+\sigma)}$ . Assumptions 1 and 2 are satisfied. We have $N_\alpha\sim \alpha^{1+\sigma}\Gamma(1-\sigma)(1/\sigma-1)^{-\sigma}$ , ${N_{\alpha}^{(e)}}\sim \alpha^2 \sigma^2/\{2(1-\sigma)^2\}$ , and
The function is separable, and for $\sigma\in(0,1)$ we obtain
Non-separable example. Consider now the non-separable function
where $\sigma\in(0,1)$ . We have $\mu(x)=\sigma (x+1)^{-1/\sigma}$ , $\mu^{-1}(x)=\sigma^{\sigma} x^{-\sigma}-1$ , $\ell(t)\sim \sigma^{\sigma}$ , and $\ell_\sigma^*(t)\sim \left \{\sigma^\sigma\Gamma(1-\sigma) \right \}^{-2/(1+\sigma)}$ . Assumptions 1 and 2 are satisfied as for all $(x,y)\in\mathbb R_+^2$ ,
We have $N_\alpha\sim \alpha^{1+\sigma}\Gamma(1-\sigma)\sigma^\sigma$ , ${N_{\alpha}^{(e)}}\sim \alpha^2 \sigma^2/\{2(1-\sigma)\}$ , and
We have $\int \mu(x)^2dx=\frac{\sigma^3}{2-\sigma}$ . There is no analytical expression for $\int W(x,y)W(y,z)W(x,z)dxdydz$ , but this quantity can be evaluated numerically, and is non-zero, so the global clustering coefficient converges almost surely to a non-zero constant for any $\sigma\in(0,1)$ . For the local clustering coefficient, we have $\mu(x)^2\sim \sigma^2x^{-2/\sigma}$ as $x\to\infty$ and
Hence the local clustering coefficients $C_{\alpha,j}^{(\ell)}$ converge in probability to 0 for all j.
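As a sanity check on the closed forms stated for the non-separable example, the following sketch verifies that the stated $\mu^{-1}$ inverts $\mu(x)=\sigma(x+1)^{-1/\sigma}$ and that $\int\mu(x)^2dx=\sigma^3/(2-\sigma)$. The value $\sigma=0.5$, the test point, and the quadrature scheme are arbitrary illustrative choices.

```python
def mu(x, sigma):
    """Marginal graphon function of the non-separable example."""
    return sigma * (x + 1.0) ** (-1.0 / sigma)


def mu_inverse(y, sigma):
    """Inverse stated in the text: sigma^sigma * y^(-sigma) - 1."""
    return sigma ** sigma * y ** (-sigma) - 1.0


def integral_mu_squared(sigma, n=200_000):
    """Midpoint rule after the substitution u = 1/(x+1), which maps
    [0, inf) to (0, 1] and gives mu(x)^2 dx = sigma^2 * u^(2/sigma - 2) du."""
    h = 1.0 / n
    return sigma ** 2 * h * sum(((i + 0.5) * h) ** (2.0 / sigma - 2.0)
                                for i in range(n))


sigma = 0.5
round_trip = mu_inverse(mu(3.0, sigma), sigma)  # should recover x = 3
numeric = integral_mu_squared(sigma)
closed_form = sigma ** 3 / (2.0 - sigma)
```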
6.4. Almost extremely sparse graph
Consider the function
We have $\overline W=1$ , $\mu(x)=(x+1)^{-1}(1+\log\!(1+x))^{-2}$ , and, using properties of inverses of regularly varying functions, $\mu^{-1}(x)\sim x^{-1}\ell(1/x)$ as $x\rightarrow 0$ , where $\ell(t)=\log\!(t)^{-2}$ is a slowly varying function. For $t>1$ we have $\ell_1(t)=\int_t^\infty x^{-1}\ell(x)dx=1/\log\!(t)$ and $\ell_1^*(t)\sim \log\!(t)/2.$ Assumptions 1 and 2 are satisfied, and almost surely
We have $\int\mu(x)^2dx=\frac{1}{6}(2+e\text{Ei}({-}1))\simeq 0.234$ , where Ei is the exponential integral; hence $C_\alpha^{(g)}\to 0.0547$ almost surely.
6.5. Model of [Reference Caron and Fox12]
The paper [Reference Caron and Fox12] studied a particular subclass of non-separable graphon models. This class is very flexible and allows one to span the whole range of sparsity and power-law behaviours described in Section 3. As shown by [Reference Caron and Fox12], efficient Monte Carlo algorithms can be developed for estimating the parameters of this class of models. Additionally, [Reference Borgs, Chayes, Dhara and Sen11, Corollary 1.3] recently showed that this class is the limit of some sparse configuration models, providing further motivation for the study of their mathematical properties.
Let $\rho$ be a Lévy measure on $(0,+\infty)$ and $\overline{\rho}(x)=\int_{x}^{\infty}\rho(dw)$ the corresponding tail Lévy intensity with generalised inverse $\overline{\rho}^{-1}(x)=\inf\{u>0|\overline{\rho}(u)<x\}$ . The paper [Reference Caron and Fox12] introduced the model defined by
The quantity $w=\overline \rho^{-1}(x)$ can be interpreted as the sociability of a node with parameter x. The larger this value, the more likely the node is to connect to other nodes. The tail Lévy intensity $\overline \rho$ is a monotone decreasing function; its behaviour at zero will control the behaviour of low-degree nodes, while its behaviour at infinity will control the behaviour of high-degree nodes.
The following proposition formalises this and shows how the results of Sections 3 and 4 apply to this model. Its proof is given in Section 6.6.
Proposition 11. Consider the graphon function W defined by Equation (41) with Lévy measure $\rho$ and tail Lévy intensity $\overline\rho$ . Assume $m=\int_0^\infty w\rho(dw)<\infty$ and
for some $\sigma\in [0,1]$ and some slowly varying function $\widetilde\ell$ . Then Equation (3) and Assumptions 1 and 2 hold, with $a=1$ and $\ell(x)=(2m)^\sigma \widetilde \ell(x).$ Proposition 1, Theorems 1 and 2, and Corollary 1 therefore hold. If $\int_0^\infty \psi(2w)^2\rho(dw)<\infty$ , where $\psi(t)=\int(1-e^{-wt})\rho(dw)$ is the Laplace exponent, then the global clustering coefficient converges almost surely,
and when $\sigma\in(0,1)$ , Proposition 6 holds and for any $j\geq 2$ we have
almost surely. For a given subgraph F, the CLT for the number of such subgraphs (Proposition 7) holds if $\int \psi\big(2\overline\rho^{-1}(x)\big)^{2|F|-2}dx<\infty$ . Under Assumption 1, this condition always holds if $\sigma=0$ ; for $\sigma\in(0,1]$ , it holds if $\overline \rho(x)=O\big(x^{-(2|F|-2)\sigma-\epsilon}\big)$ as $x\to\infty$ for some $\epsilon>0$ . In this case, we have
Moreover, if $\int w^6\rho(dw)<\infty$ , then Assumptions 4 and 5 also hold. It follows that Theorems 3, 4, and 5 apply, and for any $\sigma\in[0,1]$ and any $\ell$ ,
Finally, assume $\sigma\in(0,1)$ and $\widetilde\ell(t)=c>0$ . If additionally
for some $\tau>0$ , $c_0>0$ , then Assumption 3 is also satisfied with $\tau>0$ , $\ell_2(x)=\frac{c_0}{2^{\sigma\tau}c^\tau\Gamma(1-\sigma)^\tau}$ , and Proposition 2 applies; that is, for fixed $\alpha$ ,
We consider below two specific choices of mean measures $\rho$ . The two measures have similar properties for large graph size $\alpha$ , but different properties for large degrees j.
Generalised gamma measure. Let $\rho$ be the generalised gamma measure
with $\tau_0>0$ and $\sigma_0\in({-}\infty,1)$ . The tail Lévy intensity satisfies
as $x\to 0$ . Then for $\sigma_0\in(0,1)$ (sparse with power law),
For $\sigma_0=0$ (sparse, almost dense), we have ${N_{\alpha}^{(e)}}\asymp N^2_\alpha/\log\!(N_\alpha)^2$ and $N_{\alpha,j}/N_\alpha\rightarrow 0$ for $j \geq 1;$ for $\sigma_0<0$ (dense), we have ${N_{\alpha}^{(e)}}\asymp N^2_\alpha$ and $N_{\alpha,j}/N_\alpha\rightarrow 0$ for $j \geq 1$ , almost surely, as $\alpha$ tends to infinity. The constants in the asymptotic results are omitted for simplicity of exposition, but they can also be obtained from the results of Section 3. We have $\int w^p\rho(dw)<\infty$ for all $p\geq 1$ ; hence the global clustering coefficient converges, and the CLT applies for the number of subgraphs and the number of nodes. Note that Equation (45) is not satisfied, as the Lévy measure has exponentially decaying tails, and Proposition 2 does not apply. The asymptotic properties of this model are illustrated in Figure 2 for $\sigma_0=0.2$ and $\tau_0=2$ (sparse, power-law regime).
Generalised gamma-Pareto measure. Consider the generalised gamma-Pareto measure, introduced by [Reference Ayed, Lee and Caron2, Reference Ayed, Lee and Caron3]:
where $\gamma(s,x)=\int_0^x u^{s-1}e^{-u}du$ is the lower incomplete gamma function, $c>0$ , $\tau>1$ , $\sigma\in(0, 1)$ . The tail Lévy intensity satisfies
where
It is regularly varying at both zero and infinity, and it satisfies (42) and (45). We therefore have, almost surely,
Proposition 2 applies, and for large-degree nodes,
The global clustering coefficient converges if $\tau>2$ ; the CLT applies for the number of subgraphs F if $\tau>2|F|-2$ , and for the number of nodes if $\sigma\tau>6$ .
6.6. Proof of Proposition 11
The marginal graphon function is given by $\mu(x)=\psi(2\overline \rho^{-1}(x))$ where $\psi(t)=\int_0^\infty (1-e^{-wt})\rho(dw)$ is the Laplace exponent. Its generalised inverse is given by $\mu^{-1}(x)=\overline\rho(\psi^{-1}(x)/2).$ The Laplace exponent satisfies $\psi(t)\sim m t$ as $t\to 0$ . It therefore follows that $\mu^{-1}$ satisfies Assumption 1 with $\ell(x)=(2m)^\sigma \widetilde \ell(x).$ Ignoring loops, the model is of the form given by Equation (18) with $f(x)=2m \overline{\rho}^{-1}(x)$ . Assumption 2 is therefore satisfied. Regarding the global clustering coefficient, $\int \psi(2w)^2\rho(dw)\leq 4\int w^2\rho(dw)<\infty$ , so its limit is finite. For the local clustering coefficient, using dominated convergence and the inequality $\frac{1-e^{-2\overline{\rho}^{-1}(x)y}}{2\overline{\rho}^{-1}(x)}\leq y$ , we obtain
Using the fact that $\mu(x)=\psi(2\overline \rho^{-1}(x))\sim 2m\overline \rho^{-1}(x)$ as $x\to\infty$ , we obtain the result. Finally, if $\overline\rho$ satisfies (42), then $\psi(t)\sim \Gamma(1-\sigma)\widetilde \ell(t)t^\sigma$ as $t\to\infty$ . Using [Reference Bingham, Goldie and Teugels6, Proposition 1.5.15], we have
as $t\to\infty$ , where $\widetilde{\ell}^{\#}$ is the de Bruijn conjugate of $\widetilde{\ell}$ . We obtain $\psi^{-1}(t)=\ell_3\big(t^{1/\sigma}\big)t^{\frac{1}{\sigma}},$ where $\ell_3$ is a slowly varying function with $\ell_3\big(t^{1/\sigma}\big)\sim\widetilde{\ell}^{\#1/\sigma}\big(t^{1/\sigma}\big)\Gamma(1-\sigma)^{-1/\sigma}\text{ as }t\rightarrow\infty.$ We therefore have $ \mu^{-1}(t) \sim c_0 2^{-\tau\sigma}\ell_3\big(t^{1/\sigma}\big)^{\sigma\tau} t^{\tau}\text{ as }t\to\infty.$ If $\widetilde\ell(t)=c$ , then $\ell_3(t)=(c\Gamma(1-\sigma))^{-1/\sigma}$ .
For the CLT for the number of subgraphs F to hold, we need $\int_0^\infty \mu(x)^{2|F|-2}dx<\infty$ . As $\mu$ is monotone decreasing and integrable, we only need $\mu(x)^{2|F|-2}=\psi\big(2\overline\rho^{-1}(x)\big)^{2|F|-2}$ to be integrable in a neighbourhood of 0. In the dense case, $\psi(t)$ is bounded, and the condition holds. If $\overline\rho$ satisfies (42), then $\psi(t)\sim \Gamma(1-\sigma)\widetilde \ell(t)t^\sigma$ as $t\to\infty$ . For $\sigma\in(0,1]$ (sparse regime), the condition holds if $\overline\rho(x)=O\big(x^{-(2|F|-2)\sigma-\epsilon}\big)$ as $x\to\infty$ for some $\epsilon>0$ .
We now check the assumptions for the CLT for the number of nodes. Noting again that $\mu(x)\sim 2m\overline\rho^{-1}(x)$ as $x\to\infty$ , we have, using the inequality $1-e^{-x}\leq x$ ,
where
as $x\to\infty$ . Using now the inequality $1-e^{-x}\geq xe^{-x}$ , we have
As
as $\min(x,y)\to\infty$ , there exist $C_0=2\int w^2\rho(dw)$ and $x_0$ such that for all $x,y>x_0$ , $\nu(x,y)\geq C_0\mu(x)\mu(y)$ .
More generally, if $\int w^6\rho(dw)<\infty$ , then for any $j\leq 6$ ,
where
as $x\to\infty$ . Note also that
7. Sparse and dense models with local structure
In this section, we develop a class of models which allows us to control separately the local structure, for example the presence of communities or particular subgraphs, and the global sparsity/power-law properties. The class of models introduced can be used as a way of sparsifying any dense graphon model.
7.1. Statement of the results
By Kallenberg’s representation theorem, any exchangeable point process can be represented by Equation (2). However, it may be more suitable to use a different formulation where the function W is defined on a general space, not necessarily $\mathbb R_+^2$ , as discussed by [Reference Borgs, Chayes, Cohn and Holden9]. Such a construction may lead to more interpretable parameters and easier inference methods. Indeed, a few sparse vertex-exchangeable models, such as the models of [Reference Herlau, Schmidt and Mørup18] or [Reference Todeschini, Miscouridou and Caron37], are written in such a way that it is not straightforward to express them in the form given by (2).
In this section we show that the above results easily extend to models expressed in the following way. Let F be a probability space. Writing $\vartheta = (u,v)\in \mathbb R_+\times F$ , let $\xi(d\vartheta)=du G(dv)$ where G is some probability distribution on F. Consider models expressed as in (1) with
where $(\theta_k,\vartheta_k)_{k=1,2,\ldots}$ are the points of a Poisson point process with mean measure $d\theta\xi(d \vartheta)$ on $\mathbb R_+\times (\mathbb R_+ \times F)$ . Let us assume additionally that the function W factors in the following way:
where $\omega\,:\,F\times F\rightarrow [0,1]$ and the function $\eta\,:\,\mathbb R_+\times\mathbb R_+\rightarrow [0,1]$ is integrable. In this model, $\omega$ can capture the local structure, as in the classical dense graphon, and $\eta$ the sparsity behaviour of the graph. Let $\mu_\eta(u)=\int_0^\infty \eta(u,u^{\prime})du^{\prime}$ , $\mu_\omega (v) = \int_F\omega(v,v^{\prime})G(dv^{\prime})$ , and $\nu_\eta(x,y)=\int_0^\infty \eta(x,z)\eta(y,z)dz$ . The results presented in Section 3 remain valid when $\mu_\eta$ and $\nu_\eta$ satisfy Assumptions 1 and 2. The proof of Proposition 12 is given in Section 7.2.
Proposition 12. Consider the model defined by Equations (47) and (48) and assume that the functions $\mu_\eta$ and $\nu_\eta$ satisfy Assumptions 1 and 2. Then the conclusions of Proposition 1 hold, and so do the conclusions of Theorems 1 and 2, with $\ell(\alpha)$ and $\ell_1(\alpha)$ replaced respectively by
Consider for example the following class of models for dense and sparse stochastic block-models.
Example 2. (Dense and sparse stochastic block-models.) Consider $F=[0,1]$ and G the uniform distribution on [0, 1]. We choose for $\omega$ the graphon function associated to a (dense) stochastic block-model. For some partition $A_1,\ldots,A_p$ of [0, 1] and any $v,v^{\prime}\in[0,1]$ , let
with $v\in A_k$ , $v^{\prime}\in A_\ell$ , and B a $p\times p$ matrix where $B_{k,\ell}\in[0,1]$ denotes the probability that a node in community k forms a link with a node in community $\ell$ . Then $\omega$ defines the community structure of the graph, and $\eta$ will tune its sparsity properties. Choosing $\eta(x,y)=\mathbb{1}_{x\leq 1}\mathbb{1}_{y\leq 1}$ yields the dense, standard stochastic block-model; choosing $\eta(x,y)=\exp({-}x-y)$ yields a sparse stochastic block-model without power-law behaviour; and so on. Figure 4 gives an illustration of the use of this model to obtain sparse stochastic block-models with power-law behaviour, generalising the model of Section 6.3. The function $\omega$ is defined by
and $\eta(x,y)=(1+x)^{-1/\sigma}(1+y)^{-1/\sigma}$ , with $\sigma=0.8$ .
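The construction of Example 2 can be sketched in a few lines. The code below assumes that the factorisation takes the form $W((u,v),(u^{\prime},v^{\prime}))=\omega(v,v^{\prime})\eta(u,u^{\prime})$ and that $\omega(v,v^{\prime})=B_{k,\ell}$ for $v\in A_k$ , $v^{\prime}\in A_\ell$ ; the two-block partition and the matrix entries are hypothetical values chosen for illustration and are not those used in Figure 4.

```python
from bisect import bisect_right


def make_omega(breakpoints, B):
    """Block-model graphon on [0,1]: `breakpoints` are the interior cut points
    of the partition A_1,...,A_p and B is a p x p matrix of link probabilities."""
    def omega(v, v_prime):
        k = bisect_right(breakpoints, v)          # block index of v
        ell = bisect_right(breakpoints, v_prime)  # block index of v'
        return B[k][ell]
    return omega


def eta_power_law(x, y, sigma=0.8):
    """Sparsity component eta(x,y) = (1+x)^(-1/sigma) * (1+y)^(-1/sigma)."""
    return (1.0 + x) ** (-1.0 / sigma) * (1.0 + y) ** (-1.0 / sigma)


def W(point, point_prime, omega, eta):
    """Assumed factorised graphon: omega on the community coordinates (v, v')
    times eta on the sparsity coordinates (u, u')."""
    (u, v), (u_p, v_p) = point, point_prime
    return omega(v, v_p) * eta(u, u_p)


# Hypothetical two-community example: A_1 = [0, 0.5), A_2 = [0.5, 1].
omega = make_omega([0.5], [[0.9, 0.1], [0.1, 0.8]])
p_same = W((0.0, 0.2), (0.0, 0.3), omega, eta_power_law)
p_cross = W((1.0, 0.2), (1.0, 0.7), omega, eta_power_law)
```

Swapping `eta_power_law` for another integrable $\eta$, such as $\exp({-}x-y)$, changes the sparsity regime while leaving the community structure encoded by `omega` untouched.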
More generally, one can build on the large literature on (dense) graphon/exchangeable graph models, and combine these models with a function $\eta$ satisfying Assumptions 1 and 2, such as those described in the previous section, in order to sparsify a dense graphon and control its sparsity/power-law properties.
Remark 5. We can also obtain asymptotic results for those functions W that do not satisfy the separability condition (48). Let $\mu(u,v)=\int_{\mathbb R_+\times F} W((u,v),(u^{\prime},v^{\prime}))du^{\prime}G(dv^{\prime})$ . Assume that for each fixed v there exists $u_0(v)>0$ such that, for $u>u_0(v)$ ,
where $\tilde\mu_\omega\,:\,F\rightarrow \mathbb R_+$ , $\tilde\mu_\eta\,:\,\mathbb R_+ \rightarrow \mathbb R_+$ with $\tilde\mu_\eta(u)=\int_0^\infty \tilde\eta(u,u^{\prime})du^{\prime}$ for some positive function $\tilde\eta$ , and $C_3>0$ and $C_4>0$ . Assume that $\tilde\mu_\eta$ and $\tilde\nu_\eta$ satisfy Assumptions 1 and 2. Then the results of Theorems 1 and 2 and Corollary 1 hold up to a constant. For example, for $\sigma\in[0,1]$ we have ${N_{\alpha}^{(e)}}\asymp N_\alpha^{2/(1+\sigma)}\ell_\sigma^*(N_\alpha)$ almost surely as $\alpha$ tends to infinity. In particular, the inequality from (50) is satisfied if
The models developed by [Reference Herlau, Schmidt and Mørup18, Reference Todeschini, Miscouridou and Caron37] for capturing (overlapping) communities fit into this framework. Ignoring loops, both models can be written in the form given by Equation (51) with $\tilde\eta(u,u^{\prime})=2\overline\rho^{-1}(u)\overline\rho^{-1}(u^{\prime})$ , where $\rho$ is a Lévy measure on $(0,+\infty)$ and $\overline{\rho}(x)=\int_{x}^{\infty}\rho(dw)$ is the tail Lévy intensity with generalised inverse $\overline{\rho}^{-1}(x)$ . When $\tilde \omega$ is given by Equation (49), it corresponds to the (dense) stochastic block-model graphon of [Reference Herlau, Schmidt and Mørup18], and when $\tilde \omega(v_i,v_j)=v_i^{T}v_j$ with $v_i\in \mathbb R_+^p$ , it corresponds to the model of [Reference Todeschini, Miscouridou and Caron37]. For instance, let $\rho$ be the mean measure from Equation (46) with parameters $\tau_0>0$ and $\sigma_0\in({-}\infty,1)$ . Then for $\sigma_0\in(0,1)$ , the corresponding sparse regime with power law for this graph is given by
For $\sigma_0=0$ (sparse, almost dense regime), ${N_{\alpha}^{(e)}}\asymp N^2_\alpha/\log\!(N_\alpha)^2$ and $N_{\alpha,j}/N_\alpha\rightarrow 0$ for $j\geq 1$ ; for $\sigma_0<0$ (dense regime), ${N_{\alpha}^{(e)}}\asymp N^2_\alpha$ and $N_{\alpha,j}/N_\alpha\rightarrow 0$ for $j \geq 1$ , almost surely, as $\alpha$ tends to infinity.
7.2. Proof of Proposition 12
The proofs of Proposition 1 and Theorems 1 and 2 hold with x replaced by $(u,v) \in \mathbb R_+ \times F$ , $dx = du G(dv)$ , and $\mu(x) = \mu_\eta(u)\mu_\omega(v)$ . We thus need only prove that if $\eta$ satisfies Assumptions 1 and 2, then Lemmas B.2, B.3, and B.4 in the appendix hold. Recall that $\mu(x) = \mu_\eta (u) \mu_\omega(v) $ , for $x=(u,v)$ . Then, for all v such that $\mu_\omega(v) >0$ , we apply Lemma B.2 to get
For all v such that $\mu_\omega(v) >0$ , this leads to
To prove that there is convergence in $L_1(G)$ , note that if $\mu_\omega(v) >0$ , since $\mu_\omega \leq 1$ , we have
Moreover,
thus the Lebesgue dominated convergence theorem implies
when $\sigma <1$ , while when $\sigma=1$ ,
The same reasoning is applied to the integrals
To verify Lemma B.3, note that
so that the Lebesgue dominated convergence theorem also leads to
and the control of the integrals $\int_{\mathbb R_+\times F} \{t\mu(u,v)\}e^{-t \mu(u,v)}duG(dv )$ as in Lemma B.4.
8. Conclusion
In this article, we derived a number of properties of graphs based on exchangeable random measures. We related the sparsity and power-law properties of the graphs to the regular variation properties of the marginal graphon function, identifying four different regimes, from dense to almost extremely sparse. We derived asymptotic results for the global and local clustering coefficients. We derived a central limit theorem for the number of nodes $N_\alpha$ in the sparse and dense regimes, and for the number of nodes of degree greater than j in the dense regime. We conjecture that a CLT also holds for $N_{\alpha,j}$ in the sparse regime, under assumptions similar to Assumptions 4 and 5, and that a (lengthy) proof similar to that of Theorem 5 could be used. We leave this for future work.
Appendix A. Proofs of Theorem 1 and Proposition 6
Let $g_{\alpha,x}(\theta,\vartheta)$ be defined, for any $\alpha,x,\theta,\vartheta>0$ , by
A.1. Proof of Theorem 1
The mean number of nodes is
see [Reference Veitch and Roy38, Theorem 5.4]. By the Lebesgue dominated convergence theorem, we have $\alpha\int_{\mathbb{R}_{+}}W(x,x)e^{-\alpha\mu(x)}dx=o(\alpha)$ . Using Lemma B.2, as $\alpha $ goes to infinity we have $\int_{\mathbb{R}_{+}}(1-e^{-\alpha\mu(x)})dx \sim\alpha^{\sigma}\ell(\alpha)\Gamma(1-\sigma)$ for $\sigma\in[0,1)$ and $\int_{\mathbb{R}_{+}}\{1-e^{-\alpha\mu(x)}\}dx \sim\alpha\ell_1(\alpha)$ for $\sigma=1$ . It follows that as $\alpha $ goes to infinity,
The mean number of nodes of degree j is
see [Reference Veitch and Roy38, Theorem 5.5]. Lemma B.3 implies that
and from Lemma B.2, when $\sigma\in[0,1)$ , we have
If $\sigma=1$ , from Lemma B.2 we have
and for $j \geq 2$ ,
Finally, for $\sigma\in[0,1)$ we obtain
and for $\sigma=1$ we obtain $E(N_{\alpha,1}) \sim \alpha^{2}\ell_1(\alpha)$ and $E\big(N_{\alpha,j}\big) \sim \alpha^{2}\ell(\alpha)/\{j(j-1)\}$ for $j\geq 2$.
A.2. Proof of Proposition 6
For $j\geq 1$ , define
Then $R_{\alpha j}$ corresponds to the number of triangles having a node of degree j as a vertex, where triangles having $k\leq 3$ degree-j nodes as vertices are counted k times. We therefore have
The proof for the asymptotic behaviour of the local clustering coefficients $C_{\alpha,j}^{(\ell)}$ is organised as follows. We first derive a convergence result for $E(R_{\alpha j})$ . This result is then extended to an almost sure result. The extension requires some additional work as $R_{\alpha j}$ is not monotone, and $\sum_{k\geq j} R_{\alpha k}$ is monotone but not of the same order as $R_{\alpha j}$ ; hence a proof similar to that for $N_{\alpha, j}$ (see Section 3.2) cannot be used. The almost sure convergence results for $C_{\alpha,j}^{(\ell)}$ and $\overline C_{\alpha}^{(\ell)}$ then follow from the almost sure convergence result for $R_{\alpha j}$ .
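To make the counting convention concrete, the following minimal Python sketch computes, for a finite simple graph, the number $N_j$ of degree-j nodes, the triangle count $R_j$ (each triangle counted once per degree-j vertex it contains), and a degree-j local clustering coefficient. The function name `local_clustering_by_degree` and the normalisation $2R_j/\{j(j-1)N_j\}$ are assumptions made for illustration; they are meant to mirror the usual definition of the average local clustering coefficient of degree-j nodes, not to reproduce code from the article.

```python
from collections import defaultdict
from itertools import combinations


def local_clustering_by_degree(edges):
    """Return {j: (N_j, R_j, C_j)} for a simple undirected graph.

    N_j: number of nodes of degree j.
    R_j: sum, over degree-j nodes, of the number of triangles through the node
         (so a triangle with k degree-j vertices is counted k times).
    C_j: 2 * R_j / (j * (j - 1) * N_j) for j >= 2, and 0.0 for j = 1
         (assumed normalisation, see the lead-in).
    """
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:  # self-loops are ignored
            adj[a].add(b)
            adj[b].add(a)

    n_j = defaultdict(int)
    r_j = defaultdict(int)
    for node, nbrs in adj.items():
        j = len(nbrs)
        n_j[j] += 1
        # Triangles through `node` = pairs of neighbours that are themselves linked.
        r_j[j] += sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])

    result = {}
    for j in n_j:
        c_j = 2.0 * r_j[j] / (j * (j - 1) * n_j[j]) if j >= 2 else 0.0
        result[j] = (n_j[j], r_j[j], c_j)
    return result
```

For instance, on a triangle with one pendant edge attached, the single triangle contains two degree-2 nodes and one degree-3 node, so it contributes 2 to $R_2$ and 1 to $R_3$.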
We have
and
where $g_{\alpha,x}(\theta,\vartheta)$ is defined in Equation (52). Applying the Slivnyak–Mecke theorem, we obtain
Note that under Assumption 1 with $\sigma\in(0,1)$ , $\mu(x)>0$ for all x. The leading term in the right-hand side of Equation (A.2) is the first term. We have therefore
where
As $\lim_{x\to\infty}W(x,x)=0$ , the condition (29) implies $\lim_{x\to\infty}L(x)= b$ .
Case $b>0$ . Assume first that $b>0$ . In this case, L is a slowly varying function by assumption. Therefore, using Lemma B.5, we have, under Assumption 1, for $\sigma\in(0,1)$ ,
as $\alpha$ tends to infinity. Hence
as $\alpha$ tends to infinity. In order to obtain convergence in probability, we state the following proposition, whose proof is given in Section A.3 of the supplementary material [Reference Caron, Panero and Rousseau13] and is similar to that of Proposition 4.
Proposition A.1. Under Assumptions 1 and 2, with $\sigma\in[0,1]$ , slowly varying function $\ell$ , and positive scalar a satisfying (17), we have
and for any sequence $\alpha_n$ going to infinity such that $\alpha_{n+1} - \alpha_n = o(\alpha_n)$ ,
We now want to find a subsequence $\alpha_n$ along which the convergence is almost sure. Using Chebyshev’s inequality and the first part of Proposition A.1, there exist $n_0\ge 0$ and $C\ge 0$ such that for all $n>n_0$ ,
Now, if Assumption 2 is satisfied for a given $a>1/2$ , consider the sequence
so that $\sum_n \alpha_n^{1-2a} < +\infty$ and
Therefore, using the Borel–Cantelli lemma, we have
almost surely as $n\to\infty$ .
The goal is now to extend this result to $R_{\alpha j}$ , by sandwiching. Let $I_\alpha \,:\!=\, \{ i \,:\, \theta_i \leq \alpha\}$ . We have the following upper and lower bounds for $R_{\alpha j}$ :
Considering the upper bound of (58), we have
where
We can bound the lower bound of (58) by
The following lemma, proved in Section A.4 of the supplementary material [Reference Caron, Panero and Rousseau13], provides an asymptotic bound for the remainder term $\widetilde R_{nj}$ .
Lemma A.1. Let $\widetilde R_{nj}$ be defined as in Equation (60). If Assumptions 1 and 2 hold with $\sigma\in(0,1)$ and slowly varying function $\ell$ , and the condition (29) is satisfied with $b>0$ , then we have
almost surely as $\alpha$ tends to infinity.
Combining Lemma A.1 with the inequalities (58), (59), and (61), and the fact that $R_{\alpha_n j}\sim R_{\alpha_{n+1} j}\asymp \alpha_n^{1+\sigma}\ell(\alpha_n)$ almost surely as $n\to\infty$ , we obtain by sandwiching
Recalling that $N_{\alpha,j}\sim\frac{\sigma\Gamma(j-\sigma)}{j!}\alpha^{1+\sigma}\ell(\alpha)$ almost surely, we have, for any $j\geq1$ ,
Finally, as $\frac{N_{\alpha,j}}{N_{\alpha}-N_{\alpha,1}}$ converges to a constant $\pi_{j}\in(0,1)$ almost surely for any j, we have, using Toeplitz’s lemma,
almost surely as $\alpha$ tends to infinity.
Case $b=0$ . In the case $L(x)\rightarrow0$ , Lemma B.5 gives $\int_{0}^{\infty}L(x)\mu(x)^{j}e^{-\alpha\mu(x)}dx=o(\alpha^{\sigma-j})$ ; hence, by Markov’s inequality,
and $C_{\alpha j}^{(\ell)}\rightarrow 0$ in probability as $\alpha$ tends to infinity.
Appendix B. Technical lemmas
The proof of the following lemma is similar to the proof of Proposition 2 in [Reference Gnedin, Hansen and Pitman17], and is omitted here.
Lemma B.1. Let $(X_{t})_{t\geq0}$ be some positive monotone increasing stochastic process with finite first moment $(E(X_{t}))_{t\geq0}\in RV_{\gamma}$ , where $\gamma\geq0$ (see Definition C.1). Assume
for some $a>0$ . Then
The following lemma is a compilation of results from Propositions 17, 18, and 19 in [Reference Gnedin, Hansen and Pitman17].
Lemma B.2. Let $\mu\,:\,\mathbb{R}_{+}\rightarrow\mathbb{R}_{+}$ be a positive, right-continuous, and monotone decreasing function with $\int_0^\infty \mu(x)dx<\infty$ and generalised inverse $\mu^{-1}(x)=\inf\{ y> 0\mid \mu(y)\leq x\}$ satisfying
where $\sigma\in\lbrack0,1]$ and $\ell$ is a slowly varying function. Consider
Then, for any $\sigma\in\lbrack0,1)$ ,
and for $r\geq 1$ ,
as $t\rightarrow\infty.$ For $\sigma=1$ , as $t\rightarrow\infty$ ,
where $\ell_1(t)=\int_t^\infty x^{-1} \ell(x)dx$ . Note that $\ell(t)=o(\ell_1(t))$ ; hence $g_r(t)=o\big\{t^{1-r}\ell_1(t)\big\}$ .
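As a numerical sanity check of the case $\sigma\in[0,1)$, the sketch below evaluates $g_0(t)=\int_0^\infty(1-e^{-t\mu(x)})dx$ for one illustrative choice, $\mu(x)=(1+x)^{-2}$, and compares it with $\Gamma(1-\sigma)t^{\sigma}\ell(t)$. This choice of $\mu$ is an assumption made here for illustration: its generalised inverse is $\mu^{-1}(x)=x^{-1/2}-1\sim x^{-1/2}$ as $x\to 0$, so $\sigma=1/2$ and $\ell\equiv 1$. The helper names `g0` and `ratio` are hypothetical.

```python
import math

SIGMA = 0.5  # index associated with the illustrative mu(x) = (1 + x)**(-2)


def g0(t, n=100_000):
    """Midpoint-rule approximation of int_0^inf (1 - exp(-t * mu(x))) dx.

    With u = 1 / (1 + x), the integral becomes
    int_0^1 (1 - exp(-t * u**2)) / u**2 du, whose integrand is bounded on (0, 1].
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        # -expm1(-z) computes 1 - exp(-z) accurately for small z.
        total += -math.expm1(-t * u * u) / (u * u)
    return total * h


def ratio(t):
    """g_0(t) divided by the predicted equivalent Gamma(1 - sigma) * t**sigma."""
    return g0(t) / (math.gamma(1.0 - SIGMA) * t ** SIGMA)
```

For this particular $\mu$, the ratio approaches 1 from below at rate $t^{-1/2}$, so moderately large values of t already give agreement within about one percent.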
Lemma B.3. Let $\mu\,:\,\mathbb{R}_{+}\rightarrow\mathbb{R}_{+}$ be a positive, monotone decreasing function, and $u\,:\,\mathbb R_+\rightarrow [0,1]$ a positive and integrable function with $\int_0^\infty u(x)dx<\infty$ . Consider $h_{0}(t)=\int_{0}^{\infty}u(x)(1-e^{-t\mu(x)})dx$ and, for $r\geq 1$ , $h_{r}(t)=\int_{0}^{\infty}u(x)e^{-t\mu(x)}\mu(x)^{r}dx.$ Then, as $t\rightarrow\infty$ ,
Proof. We have $h_{0}(t)\rightarrow \int_0^\infty u(x)dx$ by dominated convergence. Using Proposition C.3, we have
We proceed by induction to obtain the final result.
Lemma B.4. Let $\mu $ be a non-negative, non-increasing function on $\mathbb R_+$ , with $\int_0^{\infty}\mu(x)dx<\infty$ , whose generalised inverse $ \mu^{-1} $ satisfies $ \mu^{-1}(x)\sim x^{-\sigma}\ell(1/x)$ as $x\rightarrow 0$ , with $\sigma \in [0,1]$ and $\ell$ a slowly varying function. Then as $t\rightarrow\infty$ , for all $r> \sigma$ ,
Proof. Let $r>\sigma$ . Let $U(y)=\mu^{-1}(1/y)$ . Then U is non-negative and non-decreasing, with $U(y)\sim y^{\sigma}\ell(y)$ as $y\rightarrow\infty$ . Making the change of variable $x=U(y)$ , one obtains
We follow part of the proof in [Reference Bingham, Goldie and Teugels6, p. 37]. Note that $y\mapsto y^{-r}\exp({-}t/y)$ is monotone increasing on $(0,t/r]$ and monotone decreasing on $[t/r,\infty)$ . We have
for t large, using the regular variation property of U. Using Potter’s bound [Reference Bingham, Goldie and Teugels6, Theorem 1.5.6], for any $\delta>0$ and for t large we have
Hence, for t large,
Taking $0<\delta<\frac{r-\sigma}{2}$ , we conclude that the series in the right-hand side converges.
The next lemma is a slight variation of Lemma B.2, with the addition of a slowly varying function in the integrals. Note that the case where $\sigma=0$ and $\ell$ tends to a constant is not covered.
Lemma B.5. Let $f\,:\,\mathbb{R}_{+}\rightarrow\mathbb{R}_{+}$ be a positive, right-continuous, and monotone decreasing function with $\int_0^\infty f(x)dx<\infty$ and generalised inverse $f^{-1}(x)=\inf\{ y> 0\mid f(y)\leq x\}$ satisfying
where $\sigma\in[0,1]$ and $\ell$ is a slowly varying function, with $\lim_{t\to\infty}\ell(t)=\infty$ if $\sigma=0$ . Consider
and, for $r\geq 1$ ,
where $L\,:\,\mathbb R_+\rightarrow (0,\infty)$ is a locally integrable function with $\lim_{t\to\infty}L(t)= b\in[0,\infty)$ . Then, for any $\sigma\in\lbrack0,1)$ ,
and for $r\geq 1$ ,
as $t\rightarrow\infty.$ For $\sigma=1$ , $b>0$ , as $t\rightarrow\infty$ ,
where $\ell_1(t)=\int_t^\infty x^{-1} \ell(x)dx$ . Note that $\ell(t)=o(\ell_1(t))$ ; hence $\widetilde g_r(t)=o\{t^{1-r}\ell_1(t)\}$ .
Proof. Let $g_{0}(t)=\int_{0}^{\infty}\big(1-e^{-tf(x)}\big)dx$ . Let $\ell_1(t)=\int_t^\infty x^{-1} \ell(x)dx$ and $\ell_\sigma(t)=\Gamma(1-\sigma)\ell(t)$ if $\sigma\in[0,1)$ . Using Lemma B.2, we have $g_{0}(t)\sim t^{\sigma}\ell_\sigma(t)$ as $t\to\infty$ , and in particular $g_{0}(t)\to\infty$ . By dominated convergence, for any $x_0>0$ , we have
hence, $\widetilde g_{0}(t)\sim \int_{x_0}^{\infty} \big(1-e^{-tf(x)}\big)L(x)dx$ as $t\to\infty$ .
Let $\epsilon>0$ . There exists $x_0$ such that for all $x\geq x_0$ , $|L(x)-b|\leq \epsilon$ and so
Hence, by sandwiching, we have
As this is true for any $\epsilon>0$ , we obtain $\widetilde g_{0}(t)\sim bt^{\sigma}\ell_\sigma(t)\text{ as }t\rightarrow\infty$ if $b>0$ and $\widetilde g_{0}(t)=o(t^{\sigma}\ell_\sigma(t))$ if $b=0$ . The asymptotic results for $\widetilde g_r(t)$ then follow from Proposition C.3.
The following is a corollary of [Reference Willmot40, Theorem 2.1].
Corollary B.1. [Reference Willmot40, Theorem 2.1]. Assume that
where $\ell$ is a slowly varying, locally bounded function on $(0,\infty)$ , and where either $\beta\geq 0$ and $\alpha\in \mathbb R$ , or $\alpha<-1$ and $\beta=0$ . Then, as $n\to\infty$ ,
and
for any locally bounded function u vanishing at infinity.
Proof. Equation (64) is proved in [Reference Willmot40, Theorem 2.1]. For any $x_0>0$ , we have
For any $\epsilon>0$ , there is $x_0$ such that $u(x)<\epsilon$ for all $x>x_0$ ; hence
The following lemma is useful to bound the variance and for the proof of the central limit theorem.
Lemma B.6. Assume the functions $\mu$ and $\nu$ satisfy Assumptions 1 and 2, for some $\sigma\in[0,1]$ and slowly varying function $\ell$ , with $a>\min(1/2,\sigma)$ if $\sigma<1$ and $a=1$ if $\sigma=1$ . Then
where $\ell_\sigma$ is defined in Equation (20). If $a=1$ and $\sigma=0$ we have the stronger result
Proof. Using the fact that $\nu(x,y)\leq \sqrt{\mu(x)\mu(y)}\leq (\mu(x)+\mu(y))/2$ and Assumption 2, we have
where $a>\min(1/2,\sigma)$ if $\sigma<1$ and $a=1$ if $\sigma=1$ . Using $\int_0^{x_0}\nu(x,y) dx\leq x_0\mu(y)$ , we have
if $x_0>0$ (otherwise the bound is trivial). Since $\mu(x_0)>0$ , the right-hand side is in $o(\alpha^{-p})$ for any $p>0$ . Using Lemma B.4 (for $\sigma<1$ ) or 10 (for $\sigma=1$ ) together with Assumption 1, we therefore obtain
In the case $\sigma=0$ and $a=1$ , Lemma B.2 and Assumption 1 give
Appendix C. Background on regular variation and some technical lemmas about regularly varying functions
Definition C.1. A measurable function $U\,:\,\mathbb{R}_{+}\rightarrow\mathbb{R}_{+}$ is regularly varying at $\infty$ with index $\rho\in\mathbb{R}$ if, for $x>0$ , $\lim_{t\rightarrow\infty}U(tx)/U(t)=x^{\rho}.$ We denote this by $U\in RV_{\rho}$ . If $\rho=0$ , we call U slowly varying.
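The definition can be illustrated numerically. The sketch below uses the function $U(x)=x^{\rho}\log(1+x)$, chosen here purely as an example of a regularly varying function with index $\rho$ (the factor $\log(1+x)$ being slowly varying), and evaluates the ratio $U(tx)/U(t)$ for increasing t. The names `U` and `rv_ratio` are hypothetical.

```python
import math


def U(x, rho=1.5):
    """Illustrative regularly varying function: x**rho times the slowly varying log(1 + x)."""
    return x ** rho * math.log1p(x)


def rv_ratio(t, x, rho=1.5):
    """Ratio U(t * x) / U(t), which should approach x**rho as t grows."""
    return U(t * x, rho) / U(t, rho)
```

Because the slowly varying factor is logarithmic, convergence of the ratio to $x^{\rho}$ is slow: the relative error decays like $\log(x)/\log(t)$, which is typical when a non-constant slowly varying term is present.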
Proposition C.1. If $U\in RV_{\rho}$ , then there exists a slowly varying function $\ell\in RV_{0}$ such that
Definition C.2. The de Bruijn conjugate $\ell^\#$ of the slowly varying function $\ell$ , which always exists, is uniquely defined up to asymptotic equivalence [Reference Bingham, Goldie and Teugels6, Theorem 1.5.13] by
as $x\rightarrow\infty$ . Then $(\ell^\#)^\# \sim\ell$ . For example, $(\!\log^a x)^\#\sim \log^{-a} x$ for $a\neq 0$ and $\ell^\# (x)\sim 1/c$ if $\ell(x)\sim c$ .
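As a short worked check of the logarithmic example, assume the standard defining relation $\ell(x)\,\ell^\#\{x\ell(x)\}\rightarrow 1$ from [Reference Bingham, Goldie and Teugels6, Theorem 1.5.13]. Taking $\ell(x)=\log^{a}x$ and the candidate $\ell^\#(x)=\log^{-a}x$ gives:

```latex
\[
\ell(x)\,\ell^\#\{x\ell(x)\}
  =\frac{\log^{a}x}{\log^{a}\!\big(x\log^{a}x\big)}
  =\left(\frac{\log x}{\log x+a\log\log x}\right)^{a}
  \longrightarrow 1
  \qquad\text{as } x\rightarrow\infty,
\]
```

since $\log\log x/\log x\rightarrow 0$. The symmetric relation $\ell^\#(x)\,\ell\{x\ell^\#(x)\}\rightarrow 1$ follows from the same computation with a replaced by $-a$.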
Proposition C.2. ([Reference Resnick36, Proposition 0.8, Chapter 0].) If $U\in RV_{\rho}$ , $\rho\in \mathbb R$ , and the sequences $(a_{n})$ and $\big(a_{n}^{\prime}\big)$ satisfy $0<a_{n}\rightarrow\infty$ , $0<a_{n}^{\prime}\rightarrow\infty$ and $a_{n}\sim ca_{n}^{\prime}$ for some $0<c<\infty$ , then
Proposition C.3. ([Reference Resnick36, Proposition 0.7, p. 21].) Let $U\,:\,\mathbb{R}_{+}\rightarrow\mathbb{R}_{+}$ be absolutely continuous with density u, so that $U(x)=\int_0^x u(t)dt$ . If $U\in RV_{\rho}$ , $\rho\in\mathbb{R}$ , and u is monotone, then
furthermore, if $\rho\neq0$ , then $\textrm{sign}(\rho)u(x)\in RV_{\rho-1}$ .
Acknowledgements
The authors thank Zacharie Naulet for helpful feedback and suggestions on an earlier version of this article.
Funding information
The project leading to this work received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 834175). At the start of the project, F. Panero was funded by the EPSRC and MRC Centre for Doctoral Training in Statistical Science (grant code EP/L016710/1).
Data access statement
The data simulated to produce the figures of the paper can be found in the code repository https://github.com/francescapanero/OnSparsity_graphex.git.
Competing interests
There were no competing interests to declare which arose during the preparation or publication process of this article.
Supplementary material
The supplementary material for this article can be found at https://dx.doi.org/10.1017/apr.2022.75.