
Arcsine laws for random walks generated from random permutations with applications to genomics

Published online by Cambridge University Press:  22 November 2021

Xiao Fang*
Affiliation:
The Chinese University of Hong Kong
Han L. Gan*
Affiliation:
Northwestern University
Susan Holmes*
Affiliation:
Stanford University
Haiyan Huang*
Affiliation:
University of California, Berkeley
Erol Peköz*
Affiliation:
Boston University
Adrian Röllin*
Affiliation:
National University of Singapore
Wenpin Tang*
Affiliation:
Columbia University
*
*Postal address: Department of Statistics, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong. Email: xfang@sta.cuhk.edu.hk
**Postal address: University of Waikato, Private Bag 3105, Hamilton 3240, New Zealand. Email: han.gan@waikato.ac.nz
***Postal address: Department of Statistics, 390 Jane Stanford Way, Stanford University, Stanford, CA 94305-4020. Email: susan@stat.stanford.edu
****Postal address: Department of Statistics, University of California, Berkeley, 367 Evans Hall, Berkeley, CA 94720-3860. Email: hhuang@stat.berkeley.edu
*****Postal address: Boston University, Questrom School of Business, Rafik B. Hariri Building, 595 Commonwealth Avenue, Boston, MA 02215. Email: pekoz@bu.edu
******Postal address: Department of Statistics and Applied Probability, National University of Singapore, 6 Science Drive 2, Singapore 117546. Email: adrian.roellin@nus.edu.sg
*******Postal address: Department of Industrial Engineering and Operations Research, Columbia University, 500 W. 120th Street #315, New York, NY 10027. Email: wt2319@columbia.edu

Abstract

A classical result for the simple symmetric random walk with 2n steps is that the number of steps above the origin, the time of the last visit to the origin, and the time of the maximum height all have exactly the same distribution and converge when scaled to the arcsine law. Motivated by applications in genomics, we study the distributions of these statistics for the non-Markovian random walk generated from the ascents and descents of a uniform random permutation and a Mallows(q) permutation and show that they have the same asymptotic distributions as for the simple random walk. We also give an unexpected conjecture, along with numerical evidence and a partial proof in special cases, for the result that the number of steps above the origin by step 2n for the uniform permutation generated walk has exactly the same discrete arcsine distribution as for the simple random walk, even though the other statistics for these walks have very different laws. We also give explicit error bounds to the limit theorems using Stein’s method for the arcsine distribution, as well as functional central limit theorems and a strong embedding of the Mallows(q) permutation which is of independent interest.

Type
Original Article
Copyright
© The Author(s), 2021. Published by Cambridge University Press on behalf of Applied Probability Trust

1. Introduction

The arcsine distribution appears surprisingly in the study of random walks and Brownian motion. Let $B\;:\!=\;(B_t; \, t \ge 0)$ be one-dimensional Brownian motion starting at zero. Let $G\;:\!=\;\sup\{0 \leq s \leq 1\;:\;B_s = 0\}$ be the last exit time of B from zero before time 1, $G^{\max}\;:\!=\;\inf\{0 \leq s \leq 1\;:\;B_s = \max_{u \in [0,1]} B_{u}\}$ be the first time at which B achieves its maximum on [0,1], and $\Gamma\;:\!=\;\int_0^1 1_{\{B_s > 0\}} \, \textrm{d} s$ be the occupation time of B above zero before time 1. In [Reference Lévy43, Reference Lévy44], Lévy proved the celebrated result that G, $G^{\max}$ , and $\Gamma$ are all arcsine distributed with density

(1.1) \begin{equation} f(x) = \frac{1}{\pi \sqrt{x(1-x)}\,} \qquad \text{for } 0 < x < 1.\end{equation}

For a random walk $S_n\;:\!=\;\sum_{k = 1}^n X_k$ with increments $(X_k; \, k \ge 1)$ starting at $S_0\;:\!=\;0$, the counterparts of G, $G^{\max}$, and $\Gamma$ are given by $G_n\;:\!=\;\max\{0 \le k \le n: S_k = 0\}$, the index at which the walk last hits zero before time n, $G^{\max}_n \;:\!=\;\min\{0 \le k \le n: S_k = \max_{0 \le j \le n}S_j\}$, the index at which the walk first attains its maximum value before time n, $\Gamma_n\;:\!=\;\sum_{k = 1}^n \textbf{1}[S_k > 0]$, the number of times that the walk is strictly positive up to time n, and $N_n\;:\!=\;\sum_{k = 1}^{n} \textbf{1}[S_{k-1} \ge 0, \, S_k \ge 0]$, the number of edges which lie above zero up to time n. The discrete analog of Lévy’s arcsine law was established in [Reference Andersen2], and the limiting distribution (1.1) was computed in [Reference Chung and Feller19, Reference Erdös and Kac26]. Feller [Reference Feller28] gave the following refined treatment:

  1. (i) If the increments $(X_k; \, k \ge 1)$ of the walk are exchangeable with continuous distribution, then $\Gamma_n \stackrel{(\textrm{d})}{=} G_n^{\max}$ .

  2. (ii) For a simple random walk with $\mathbb{P}(X_k = \pm 1) = 1/2$ we have $N_{2n} \stackrel{(\textrm{d})}{=} G_{2n}$ , which follows the discrete arcsine law given by

    (1.2) \begin{equation}\alpha_{2n, 2k}\;:\!=\;\frac{1}{2^{2n}} \binom{2k}{k} \binom{2n - 2k}{n - k} \qquad \text{for } k \in \{0, \ldots, n\}.\end{equation}

In the Brownian scaling limit, the above identities imply that $\Gamma \stackrel{(\textrm{d})}{=} G^{\max} \stackrel{(\textrm{d})}{=} G$. The fact that $G \stackrel{(\textrm{d})}{=} G^{\max}$ also follows from Lévy’s identity $(|B_t|; \, t \ge 0) \stackrel{(\textrm{d})}{=} (\sup_{s \le t}B_s - B_t; \, t \ge 0)$. See [Reference Karatzas and Shreve39, Reference Pitman and Yor53, Reference Williams69] and [Reference Rogers and Williams54, Section 53] for various proofs of Lévy’s arcsine law. The arcsine law has been further generalized in several different ways, e.g. in [Reference Bertoin and Doney8, Reference Dynkin25, Reference Getoor and Sharpe30] to Lévy processes; [Reference Barlow, Pitman and Yor4, Reference Bingham and Doney13] to multidimensional Brownian motion; [Reference Akahori1, 62] to Brownian motion with drift; and [Reference Kasahara and Yano40, Reference Watanabe68] to one-dimensional diffusions. See also [Reference Pitman51] for a survey of arcsine laws arising from random discrete structures.

In this paper we are concerned with the limiting distribution of the Lévy statistics $G_n$ , $G_n^{\max}$ , $\Gamma_n$ , and $N_n$ of a random walk generated from a class of random permutations. Our motivation comes from a statistical problem in genomics.

1.1. Motivation from genomics

Understanding the relationship between genes is an important goal of systems biology. Systematically measuring the co-expression relationships between genes requires appropriate measures of the statistical association between bivariate data. Since gene expression data routinely require normalization, rank correlations such as Spearman’s rank correlation [Reference McDonald45, p. 221] have been commonly used; see, for example, [Reference Salari, Tibshirani and Pollack55]. Although some information may be lost in converting numerical values to ranks, rank correlations have the advantage over many other measures of being invariant under monotonic transformations, as well as robust and less sensitive to outliers. In genomics studies, however, these correlation-based and other global measures have a practical limitation: they measure a single, stationary dependence relationship between genes across all samples. It is quite possible that the patterns of gene association change, or exist only in a subset of the samples, especially when the samples are pooled from heterogeneous biological conditions. In response to this consideration, several recent efforts have considered statistics based on counting local patterns of gene expression ranks, so as to take into account the potentially diverse nature of gene interactions. For instance, denoting the expression profiles for genes X and Y over n conditions (or n samples) by ${\bf x}=(x_1, \dots, x_n)$ and ${\bf y}=(y_1,\dots,y_n)$ respectively, the following statistic, denoted by $W_2$, was introduced in [Reference Wang, Waterman and Huang66] to consider and aggregate possible local interactions:

\begin{multline*} W_2 =\sum_{1 \le i_1< \dots < i_k \le n} \big( \textbf{1} [\phi(x_{i_1}, \dots,x_{i_k})= \phi(y_{i_1}, \dots,y_{i_k})] \\ + \textbf{1} [\phi(x_{i_1}, \dots,x_{i_k})= \phi(-y_{i_1}, \dots,-y_{i_k})] \big),\end{multline*}

where $\textbf{1}[\cdot]$ denotes the indicator function and $\phi$ is the rank function that returns the indices of elements in a vector after they have been sorted in an increasing order (for example, consider the values $(0.5,1.5,0.2)$ , which after ordering become $(0.2,0.5,1.5)$ , described by the permutation $1\mapsto 2$ , $2\mapsto3$ , $3\mapsto1$ ; applying the same permutation to the vector (1,2,3), we thus obtain $\phi(0.5,1.5,0.2)=(3,1,2)$ , which is just the sequence of positions that, after ordering, indicate where the values were before they were ordered). The statistic $W_2$ aggregates the interactions across all subsamples of size $k \le n$ ; indeed, $W_2$ is equal to the total number of increasing and decreasing subsequences of length k in a suitably permuted sequence. To see this, suppose $\sigma$ is a permutation that sorts the elements of y in a decreasing order. Let ${\bf z}=\sigma({\bf x}) = (z_1, \dots, z_n)$ be that permutation applied to x; then $W_2$ can be rewritten as

\begin{equation*}W_2=\sum_{1 \le i_1< \dots < i_k \le n} \big(\textbf{1}[z_{i_1}< \dots <z_{i_k}] + \textbf{1}[z_{i_1} >\dots > z_{i_k}]\big).\end{equation*}
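
To make the preceding definitions concrete, the following minimal sketch (in Python; the helper names `phi` and `w2` are ours and not from [Reference Wang, Waterman and Huang66]) computes the rank function $\phi$ and evaluates $W_2$ by brute force for a given subsequence length k.

```python
from itertools import combinations

def phi(v):
    """Positions of the elements of v in increasing order (1-based argsort)."""
    return tuple(sorted(range(1, len(v) + 1), key=lambda i: v[i - 1]))

def w2(x, y, k):
    """Brute-force W_2: count size-k index sets on which x and y (or x and -y)
    show the same rank pattern."""
    total = 0
    for idx in combinations(range(len(x)), k):
        px = phi([x[i] for i in idx])
        total += (px == phi([y[i] for i in idx])) + (px == phi([-y[i] for i in idx]))
    return total

x = [0.5, 1.5, 0.2, 2.3]
y = [1.0, 0.1, 2.0, 0.05]
print(phi((0.5, 1.5, 0.2)))  # (3, 1, 2), as in the example above
print(w2(x, y, k=3))
```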

Several variants of $W_2$ have been studied to detect different types of dependent patterns between x and y (see, for example, [Reference Wang, Waterman and Huang66, Reference Wang, Liu, Theusch, Rotter, Medina, Waterman and Huang67]).

One variant, for example, is to take $k=2$ and consider only increasing patterns in z, so as to assess a negative dependence between x and y. Denoted by $W^*$, this variant can be expressed simply as $W^*=\sum_{1 \le i_1 < i_2 \le n} \textbf{1}[z_{i_1} < z_{i_2}]$. If a more specific negative dependence structure is of interest, say gene Y is an active repressor of gene X when the expression level of gene Y is above a certain value, then we would expect a negative dependence between x and y, but with that dependence occurring only locally among some vector elements. More specifically, in this situation the expression of gene X in a given condition/sample is expected to be low when the expression of gene Y is sufficiently high; equivalently, the dependence is present between a pair of elements (one from x and one from y) only when the associated element of y exceeds a certain value. To detect this type of dependence, it is natural to consider the family of statistics $W^*_m = \sum_{i=1}^{m} \textbf{1}[z_i < z_{i+1} ]$, $1\leq m\leq n-1$. Recall that the elements of y are sorted in decreasing order. Thus, if gene Y is an active repressor of gene X when the expression of gene Y is above a certain level, there should exist a change point $m_0$ such that $W_m^*$ is significantly high (compared to the null case where x and y are independent) when $m<m_0$, and the significance gradually weakens or disappears as m grows from $m_0$ to n. For mathematical convenience, considering $W^*_m$ is equivalent to considering $T_m = \sum_{i=1}^{m} (2 \textbf{1}[z_{i+1} > z_i] - 1 )$, $1\leq m\leq n-1$. As argued above, exploring the properties of this process-level statistic would be useful for understanding a ‘local’ negative relationship between x and y that holds only among a subset of vector elements, as well as for detecting when such relationships are likely to occur. To the best of our knowledge, the family of statistics $(T_m;\, 1\leq m \leq n-1)$ has not been studied theoretically in the literature. This statistic provides a motivation for studying the related problem of the permutation generated random walk.
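
The following is a minimal sketch (Python; the function name `t_process` is ours) of the process $(T_m;\, 1\leq m \leq n-1)$: sort y in decreasing order, apply the same reordering to x to obtain z, and accumulate the $\pm 1$ increments recording the ascents of z.

```python
def t_process(x, y):
    """T_m = sum_{i<=m} (2*1[z_{i+1} > z_i] - 1), m = 1, ..., n-1, where z is x
    reordered so that the corresponding entries of y are decreasing."""
    order = sorted(range(len(y)), key=lambda i: -y[i])
    z = [x[i] for i in order]
    t, path = 0, []
    for i in range(len(z) - 1):
        t += 1 if z[i + 1] > z[i] else -1
        path.append(t)
    return path

# toy example
x = [5, 1, 4, 2, 3, 9, 8, 7]
y = [1, 8, 2, 7, 5, 0, 3, 4]
print(t_process(x, y))
```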

1.2. Permutation generated random walk

Let $\pi\;:\!=\;(\pi_1, \ldots, \pi_{n+1})$ be a permutation of $[n+1]\;:\!=\;\{1, \ldots, n+1\}$ . Let

\begin{equation*}X_k\;:\!=\;\begin{cases}\, +1 & \text{if $\pi_k < \pi_{k+1}$,} \\\, -1 & \text{if $\pi_k > \pi_{k+1}$,}\end{cases}\end{equation*}

and denote by $S_n\;:\!=\;\sum_{k=1}^n X_k$ , $S_0\;:\!=\;0$ , the corresponding walk generated by $\pi$ . That is, the walk moves to the right at time k if the permutation has a rise at position k, and the walk moves to the left at time k if the permutation has a descent at position k. An obvious candidate for $\pi$ is the uniform permutation of $[n+1]$ . This random walk model was first studied in [Reference Oshanin and Voituriez49] in the physics literature, and also appeared in the study of zigzag diagrams in [Reference Gnedin and Olshanski32].
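
For illustration, here is a short sketch (Python; function name ours) of the walk generated from a uniform random permutation of $[n+1]$; only the up–down pattern of $\pi$ enters.

```python
import random

def walk_from_permutation(pi):
    """S_0 = 0; S_k = S_{k-1} + 1 if pi has a rise at position k, else S_{k-1} - 1."""
    path = [0]
    for a, b in zip(pi, pi[1:]):
        path.append(path[-1] + (1 if a < b else -1))
    return path

n = 10
pi = list(range(1, n + 2))
random.shuffle(pi)                  # uniform permutation of [n+1]
print(pi)
print(walk_from_permutation(pi))    # S_0, ..., S_n
```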

In this article, we consider a more general family of random permutations proposed by Mallows [Reference Mallows47], which includes the uniform random permutation. For $0 \leq q \leq 1$ , the one-parameter model

\begin{equation*}\mathbb{P}_{q}(\pi) = \frac{{q}^{\textrm{inv}(\pi)}}{Z_{n,q}} \qquad \text{for}\;\pi\;\textrm{a permutation of}\;[n] \end{equation*}

is referred to as the Mallows(q) permutation of [n], where $\textrm{inv}(\pi)\;:\!=\; \#\{(i,j) \in [n]\;:\;i\,<\,j\;\text{and}\;{\pi}_{i}\,>\,\pi_{j}\}$ is the number of inversions of $\pi$ , and where

\begin{equation*}Z_{n,q}\;:\!=\;\sum_{\pi} q^{\textrm{inv}(\pi)} = \prod_{j=1}^n \sum_{i = 1}^j q^{i-1} = (1-q)^{-n} \prod_{j = 1 }^n ( 1 - q^j)\end{equation*}

is known as the q-factorial. For $q = 1$ , the Mallows(1) permutation is the uniform permutation of [n]. There have been a number of works on this random permutation model; see, for example, [Reference Basu and Bhatnagar6, Reference Diaconis23, Reference Gladkich and Peled31, Reference Gnedin and Olshanski33, Reference Starr60, Reference Tang63].

Question 1.1. For a random walk generated from the Mallows(q) permutation of $[{n+1}]$ , what are the limit laws of the statistics defined at the beginning of Section 1?

For a Mallows(q) permutation of $[n+1]$, the increments $(X_k; \, 1 \le k \le n)$ are not independent or even exchangeable. Moreover, the associated walk $(S_k; \, 0 \le k \le n)$ is not Markov, and as a result the Andersen–Feller machine does not apply. Indeed, when $q=1$ this random walk has a tendency to change directions more often than a simple symmetric random walk, and thus tends to cross the origin more frequently. Note that the distribution of the walk $(S_k; \, 0 \le k \le n)$ is completely determined by the up–down sequence or, equivalently, by the descent set $\mathcal{D}(\pi)\;:\!=\;\{k \in [n]: \pi_k > \pi_{k+1}\}$ of the permutation $\pi$. The number of permutations with a given up–down sequence can be expressed either as a determinant or as a sum of multinomial coefficients; see [Reference MacMahon46, Vol. I] and [Reference Carlitz15, Reference de Bruijn21, Reference Niven48, Reference Stanley58, Reference Viennot65]. In particular, the number of permutations with a fixed number of descents is known as the Eulerian number. See also [Reference Stanley59, Section 7.23], [Reference Borodin, Diaconis and Fulman14, Section 5], and [Reference Chatterjee and Diaconis18] for the descent theory of permutations. None of these results gives a simple expression for the limiting distributions of $G_n/n$, $G^{\max}_n/n$, $\Gamma_n/n$, and $N_n/n$ of a random walk generated from the uniform permutation.

2. Main results

To answer Question 1.1, we prove a functional central limit theorem for the walk generated from the Mallows(q) permutation. Although for each $n > 0$ the associated walk $(S_k; \, 0 \le k \le n)$ is not Markov, the scaling limit is Brownian motion with drift. As a consequence, we derive the limiting distributions of the Lévy statistics, which can be regarded as generalized arcsine laws. In the following, let $(S_t; \, 0 \le t \le n)$ be the linear interpolation of the walk $(S_k; \, 0 \le k \le n)$ . That is, $S_{t} = S_{j-1} + (t-j+1)(S_j - S_{j-1})$ for $j-1 \le t \le j$ . See [Reference Billingsley12, Chapter 2] for background on the weak convergence in the space C[0,1]. The result is stated as follows.

Theorem 2.1 Fix $0< q\leq 1$ , and let $(S_k; \, 0 \le k \le n)$ be a random walk generated from the Mallows(q) permutation of $[n+1]$ . Let

(2.1) \begin{equation} \mu\;:\!=\;\frac{1-q}{1+q} , \qquad \sigma\;:\!=\;\sqrt{\frac{4q(1-q+q^2)}{(1+q)^2(1+q+q^2)}}.\end{equation}

Then, as $n \rightarrow \infty$ ,

\begin{equation*} \left(\frac{S_{nt} -\mu nt}{\sigma \sqrt{n}}; \, 0 \le t \le 1 \right) \stackrel{(\textrm{d})}{\longrightarrow} ( B_t; \, 0 \le t \le 1 ),\end{equation*}

where $\stackrel{(\textrm{d})}{\longrightarrow}$ denotes the weak convergence in C[0,1] equipped with the sup-norm topology.

Given the above theorem, it is natural to consider the dragged-down walk $S^{q}_k\;:\!=\;S_k - \mu k$, $0 \le k \le n$. Let $G^q_n$, $G^{q,\max}_n$, $\Gamma^q_n$, and $N^q_n$ be the Lévy statistics corresponding to the dragged-down walk. As a direct consequence of Theorem 2.1, the random variables $G^q_n/n$, $G^{q,\max}_n/n$, $\Gamma^q_n/n$, and $N^q_n/n$ all converge to the arcsine distribution whose density is given by (1.1).

The proof of Theorem 2.1 is given in Section 3, and makes use of the Gnedin–Olshanski construction of the Mallows(q) permutation. By letting $q= 1$ , we get the scaling limit of a random walk generated from the uniform permutation, which has recently been proved in the framework of zigzag graphs [64, Proposition 9.1]. For this case, we have the following corollary.

Corollary 2.2 Let $(S_k; \, 0 \le k \le n)$ be a random walk generated from the uniform permutation of $[n+1]$ . Then, as $n \rightarrow \infty$ ,

\begin{equation*} \left(\frac{S_{nt}}{\sqrt{n}\,}; \, 0 \le t \le 1 \right) \stackrel{(\textrm{d})}{\longrightarrow} \left(\frac{1}{\sqrt{3}\,}B_t; \, 0 \le t \le 1 \right),\end{equation*}

where $\stackrel{(\textrm{d})}{\longrightarrow}$ denotes the weak convergence in C[0,Reference Akahori1] equipped with the sup-norm topology. Consequently, as $n \rightarrow \infty$ , the random variables $G_n/n$ , $G^{\max}_n/n$ , and $\Gamma_n/n$ converge in distribution to the arcsine law given by the density (1.1).

Now that the limiting process has been established, we can ask the following question.

Question 2.3. For a random walk generated from the Mallows(q) permutation of $[{n+1}]$, what are the error bounds between $G^q_n/n$, $G^{q,\max}_n/n$, $\Gamma^q_n/n$, $N^q_n/n$, and their arcsine limit?

While we cannot answer these questions directly, we were able to prove partial and related results. To state these, we need some notation. For two random variables X and Y, we define the Wasserstein distance as $d_{\textrm{W}}(X, Y)\;:\!=\;\sup_{h \in \textrm{Lip}(1)} |\mathbb{E}h(X) - \mathbb{E}h(Y)|$ , where $\textrm{Lip}(1)\;:\!=\;\{h : |h(x) - h(y)| \le |x - y|\}$ is the class of Lipschitz-continuous functions with Lipschitz constant 1. For $m \ge 1$ , let $\textrm{BC}^{m,1}$ be the class of bounded functions that have m bounded and continuous derivatives and whose mth derivative is Lipschitz continuous. Let $\Vert h\Vert_{\infty}$ be the sup-norm of h, and if the kth derivative of h exists, let

\begin{equation*} |h|_k\;:\!=\;\left\| \frac{\textrm{d}^k h}{\textrm{d} x^k} \right\|_{\infty} , \qquad |h|_{k,1}\;:\!=\;\sup_{x,y} \left\lvert \frac{\textrm{d}^k h(x)}{\textrm{d} x^k} - \frac{\textrm{d}^k h(y)}{\textrm{d} y^k} \right\rvert \frac{1}{|x - y|}.\end{equation*}

The following results hold true for a simple random walk. However, we have strong numerical evidence that they are also true for the permutation generated random walk; see Conjecture 2.5.

Theorem 2.4 Let $(S_{k}; \, 0 \le k \le 2n)$ be a simple symmetric random walk. Then

(2.2) \begin{equation}\mathbb{P}(N_{2n} = 2k) = \alpha_{2n,2k} \qquad \text{for } k \in \{0, \ldots , n\}.\end{equation}

Moreover, let Z be an arcsine distributed random variable; then

(2.3) \begin{equation}d_{\textrm{W}}\left(\frac{N_{2n}}{2n},Z\right) \le \frac{27}{2n} + \frac{8}{n^2}.\end{equation}

Furthermore, for any $h \in \textrm{BC}^{2,1}$ ,

(2.4) \begin{equation}\left\lvert\mathbb{E}h\left(\frac{N_{2n}}{2n}\right) - \mathbb{E}h(Z) \right\rvert \leq \frac{4 |h|_2 + |h|_{2,1}}{64n} + \frac{|h|_{2,1}}{64n^2}.\end{equation}

Identity (2.2) can be found in [Reference Feller28], the bound (2.3) was proved by [Reference Goldstein and Reinert34], and the proof of (2.4) is deferred to Section 4.

Conjecture 2.5 For a uniform random permutation generated random walk of length $2n+1$ , the probability that there are 2k edges above the origin equals $\alpha_{2n,2k}$ , which is the same as that of a simple random walk (see (1.2)).

For a walk generated from a permutation of $[n+1]$ , call it a positive walk if $N_n = n$ , and a negative walk if $N_n = 0$ . In [Reference Bernardi, Duplantier and Nadeau7] it was proved that the number of positive walks $b_n$ generated from permutations of [n] is $n!!\,(n-2)!!$ if n is odd, and $[(n-1)!!]^2$ if n is even. Computer enumerations suggest that $c_{2k,2n+1}$ , the number of walks generated from permutations of $[2n+1]$ with 2k edges above the origin, satisfies

(2.5) \begin{equation}c_{2k, 2n+1} = \binom{2n+1}{2k} b_{2k} b_{2n-2k+1}.\end{equation}

Note that, for the special cases $k = 0$ and $k=n$, the formula (2.5) agrees with the known results in [Reference Bernardi, Duplantier and Nadeau7]. The formula (2.5) suggests a bijection between permutations of $[2n+1]$ with 2k positive edges and pairs of permutations of disjoint subsets of $[2n + 1]$ of respective cardinalities 2k and $2n+1-2k$ whose associated descent walks are positive. A naive idea is to break the walk into positive and negative excursions, and exclude the final visit to the origin before crossing to the other side of the origin in each excursion [Reference Andersen2, Reference Bertoin9]. However, this approach does not work since not all pairs of positive walks are obtainable. For example, for $n = 3$, the pair (1,2,3) and (7,6,5,4) cannot be obtained. If Conjecture 2.5 holds, we get the arcsine law as the limiting distribution of $N_{2n}/2n$ with error bounds.
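
The values of $b_n$ and the formula (2.5) can be checked by exhaustive enumeration for small n; the following sketch (Python, feasible only for small n since it loops over all permutations, with function names and the convention $b_0 = 1$ for the empty walk being ours) reproduces the comparison for $n = 3$.

```python
from itertools import permutations
from math import comb

def positive_edges(pi):
    """N for the generated walk: number of edges with both endpoints >= 0."""
    s_prev, s, count = 0, 0, 0
    for a, b in zip(pi, pi[1:]):
        s_prev, s = s, s + (1 if a < b else -1)
        count += (s_prev >= 0 and s >= 0)
    return count

def b(n):
    """Number of positive walks generated from permutations of [n]; b(0) = 1."""
    if n == 0:
        return 1
    return sum(positive_edges(pi) == n - 1 for pi in permutations(range(n)))

n = 3
counts = {}
for pi in permutations(range(2 * n + 1)):
    k = positive_edges(pi)
    counts[k] = counts.get(k, 0) + 1

for k in range(n + 1):
    conjectured = comb(2 * n + 1, 2 * k) * b(2 * k) * b(2 * n - 2 * k + 1)
    print(2 * k, counts.get(2 * k, 0), conjectured)   # observed count vs. (2.5)
```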

While we are not able to say much about $G_n$ , $G^{\max}_n$ , and $\Gamma_n$ with respect to a random walk generated from the uniform permutation for finite n, we can prove that the limiting distributions of these Lévy statistics are still arcsine; this is a consequence of the fact that the scaled random walks converge to Brownian motion.

Classical results in [Reference Komlós, Major and Tusnády41, Reference Komlós, Major and Tusnády42, Reference Skorokhod57] provide strong embeddings of a random walk with independent increments into Brownian motion. In view of Theorem 2.1, it is also interesting to understand the strong embedding of a random walk generated from the Mallows(q) permutation. We have the following result.

Theorem 2.6 Fix $0< q\leq 1$ , and let $(S_k; \, 0 \le k \le n)$ be a random walk generated from the Mallows(q) permutation of $[n+1]$ . Let $\mu$ and $\sigma$ be defined by (2.1), and let

\begin{equation*}\beta\;:\!=\;\frac{2}{\sigma(1+q)} , \qquad \eta\;:\!=\;\frac{2q}{1-q+q^2}.\end{equation*}

Then, there exist universal constants $n_0, c_1, c_2 > 0$ such that, for any $\varepsilon \in (0,1)$ and $n \ge n_0$ , we can construct $(S_t; \, 0 \le t \le n)$ and $(B_t; \, 0 \le t \le n)$ on the same probability space such that

\begin{equation*} \mathbb{P}\left(\sup_{0 \le t \le n} | \frac{1}{\sigma} (S_t - \mu t )- B_t | > c_1 n^{\frac{1 + \varepsilon}{4}} (\log n)^{\frac{1}{2}} \beta \right) \le \frac{c_2(\beta^6 + \eta)}{\beta^2 n^{\varepsilon} \log n}.\end{equation*}

In fact, a much more general result, namely a strong embedding for m-dependent random walks, is proved in Section 5.

Also note that there is substantial literature studying the relations between random permutations and Brownian motion. Classical results were surveyed in [Reference Arratia, Barbour and Tavaré3, Reference Pitman50]; see also [Reference Bassino, Bouvel, Féray, Gerin and Pierrot5, Reference Hoffman, Rizzolo and Slivken35, Reference Hoffman, Rizzolo and Slivken36, Reference Janson38] for recent progress on the Brownian limit of pattern-avoiding permutations.

3. Proof of Theorem 2.1

In this section we prove Theorem 2.1. To establish the result, we first show that the walk generated from the Mallows(q) permutation has one-dependent increments $(X_1, \ldots, X_n)$; that is, $(X_1, \ldots, X_j)$ is independent of $(X_{j+2}, \ldots, X_n)$ for each $j \in [n-2]$. We then calculate the moments of the increments and apply an invariance principle.

Gnedin and Olshanski [Reference Gnedin and Olshanski33] provide a nice construction of the Mallows(q) permutation, which is implicit in the original work [Reference Mallows47]. This representation of the Mallows(q) permutation plays an important role in the proof of Theorem 2.1.

For $n>0$ and $0 < q < 1$ , let $\mathcal{G}_{q,n}$ be a truncated geometric random variable on [n] whose probability distribution is given by

\begin{equation*}\mathbb{P}(\mathcal{G}_{q,n} = k) = \frac{q^{k-1}(1-q)}{1- q^n} \qquad \text{for } k \in [n].\end{equation*}

Since $\mathbb{P}(\mathcal{G}_{q,n} = k)\to n^{-1}$ as $q\to1$, we can extend the definition of $\mathcal{G}_{q,n}$ to $q=1$, which is just the uniform distribution on [n]. The Mallows(q) permutation $\pi$ of [n] is constructed as follows. Let $(Y_k; \, k \in [n])$ be a sequence of independent random variables, where $Y_k$ is distributed as $\mathcal{G}_{q,n+1-k}$. Set $\pi_1\;:\!=\;Y_1$ and, for $k \ge 2$, let $\pi_k\;:\!=\;\psi(Y_k)$, where $\psi$ is the increasing bijection from $[n - k +1]$ to $[n] \setminus \{ \pi_1, \pi_2, \ldots, \pi_{k-1} \}$. That is, pick $\pi_1$ according to $\mathcal{G}_{q,n}$, and remove $\pi_1$ from [n]. Then pick $\pi_2$ as the $\mathcal{G}_{q,n-1}$th smallest element of $[n] \setminus \{\pi_1\}$, and remove $\pi_2$ from $[n] \setminus \{\pi_1\}$; and so on (a simulation sketch of this construction is given after the list below). As an immediate consequence of this construction we have that, for the increments $(X_k; \, k \in [n])$ of a random walk generated from the Mallows(q) permutation of $[n+1]$:

  • for each k, $\mathbb{P}(X_k = 1) = \mathbb{P}(\mathcal{G}_{q,n+1 -k} \leq \mathcal{G}_{q,n-k}) = 1/(1+q)$ , which is independent of k and n; thus, $\mathbb{E}X_k = (1-q)/(1+q)$ and $\textrm{var}\;X_k = 4q/(1+q)^2$ ;

  • the sequence of increments $(X_k; \, k \in [n])$, though not independent, is a two-block factor and hence one-dependent; see [Reference de Valk22] for background.
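
A minimal simulation sketch (Python; function names are ours) of this sequential construction, together with an empirical check of the first bullet point, $\mathbb{P}(X_k = 1) = 1/(1+q)$.

```python
import random

def truncated_geometric(q, n):
    """Sample from P(G = k) proportional to q^(k-1), k in {1, ..., n}."""
    if q == 1.0:
        return random.randint(1, n)
    u = random.random()
    k, cdf = 1, (1 - q) / (1 - q ** n)
    while u > cdf:
        k += 1
        cdf = (1 - q ** k) / (1 - q ** n)
    return k

def mallows_permutation(q, n):
    """Sequential (Gnedin-Olshanski style) sampler for Mallows(q) on [n]."""
    remaining = list(range(1, n + 1))
    pi = []
    for j in range(n):
        idx = truncated_geometric(q, n - j)   # rank among the remaining values
        pi.append(remaining.pop(idx - 1))
    return pi

q, n, reps, k = 0.5, 30, 20000, 10
asc = 0
for _ in range(reps):
    p = mallows_permutation(q, n + 1)
    asc += (p[k - 1] < p[k])
print(asc / reps, 1 / (1 + q))                # empirical P(X_k = 1) vs 1/(1+q)
```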

Such a construction is also used in [Reference Gnedin and Olshanski33] to construct a random permutation of the positive integers, called the infinite q-shuffle. The latter is further extended in [Reference Pitman and Tang52] to p-shifted permutations as an instance of regenerative permutations, and used in [Reference Holroyd, Hutchcroft and Levy37] to construct symmetric k-dependent q-colorings of the positive integers.

If $\pi$ is a uniform permutation of [n], the central limit theorem of the number of descents $\# \mathcal{D}(\pi)$ is well known:

\begin{equation*} \frac{1}{\sqrt{n}\,}\left(\# \mathcal{D}(\pi) - \frac{n}{2}\right) \stackrel{(\textrm{d})}{\longrightarrow} \frac{1}{\sqrt{12}\,} \mathcal{N}(0,1),\end{equation*}

where $\mathcal{N}(0,1)$ is a standard normal distribution. See [Reference Chatterjee and Diaconis18, Section 3] for a survey of six different approaches to proving this fact. The central limit theorem for the number of descents of the Mallows(q) permutation is known, and is as follows.

Lemma 3.1. (Proposition 5.2 of [Reference Borodin, Diaconis and Fulman14].) Fix $0< q\leq 1$ , let $\pi$ be the Mallows(q) permutation of [n], and let $\# \mathcal{D}(\pi)$ be the number of descents of $\pi$ . Then

\begin{equation*}\mathbb{E} \# \mathcal{D}(\pi) = \frac{(n-1)q}{1+q} , \qquad \textrm{var} \# \mathcal{D}(\pi) = q \frac{(1-q+q^2)n -1 + 3q - q^2}{(1+q)^2(1+q+q^2)}.\end{equation*}

Moreover,

\begin{equation*} \frac{1}{\sqrt{n}\,}\left(\# \mathcal{D}(\pi) - \frac{nq}{1+q}\right) \stackrel{(\textrm{d})}{\longrightarrow} \mathcal{N}\left(0, \frac{q(1-q+q^2)}{(1+q)^2(1+q+q^2)} \right).\end{equation*}
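
For small n, the mean formula in Lemma 3.1 can be confirmed by exact enumeration over all n! permutations, weighting each by $q^{\textrm{inv}(\pi)}$; a sketch (Python, exact rational arithmetic, function name ours):

```python
from itertools import permutations
from fractions import Fraction

def mean_descents(n, q):
    """E[#D(pi)] under Mallows(q) on [n], by exact enumeration."""
    num = den = Fraction(0)
    for pi in permutations(range(n)):
        inv = sum(pi[i] > pi[j] for i in range(n) for j in range(i + 1, n))
        des = sum(pi[i] > pi[i + 1] for i in range(n - 1))
        w = q ** inv
        num += w * des
        den += w
    return num / den

n, q = 6, Fraction(1, 2)
print(mean_descents(n, q), (n - 1) * q / (1 + q))   # both equal 5/3
```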

We are now ready to prove Theorem 2.1.

Proof of Theorem 2.1. We first recall [Reference Billingsley11, Theorem 5.1]: Let $X_1,X_2,\dots$ be an m-dependent sequence, and let $s_n^2=\sum_{i=1}^n\mathbb{E}X_i^2$. If $\mathbb{E} X_n=0$ for all $n\geq 1$, if $\limsup_{n\to\infty} \mathbb{E} X_n^2<\infty$, if $|s_n^2 -n\sigma^2| = \textrm{O}(1)$ for some $\sigma^2>0$, and if

\begin{equation*}\lim_{n\to\infty} s_n^{-2-\delta}\sum_{i=1}^n\mathbb{E} X_i^{2+\delta}=0\end{equation*}

for some $\delta>0$ , then the invariance principle holds for the sequence $X_1,X_2,\dots$ with normalizing factor $\sigma n^{1/2}$ ; that is, the sequence of processes $S_n(t)$ , $0\leq t\leq 1$ , defined by $S_n(k/n)=\sigma^{-1} n^{-1/2}\sum_{i=1}^k X_i$ for $0\leq k\leq n$ and linearly interpolated otherwise, converges weakly to a standard Brownian motion on the unit interval with respect to the Borel sigma-algebra generated by the topology of the supremum norm on the space of continuous functions on the unit interval.

Since the increments of a permutation generated random walk are one-dependent, the functional central limit theorem is an immediate consequence of [Reference Billingsley11, Theorem 5.1] and the moments in Lemma 3.1.

4. Proof of Theorem 2.4

4.1. Stein’s method for the arcsine distribution

It is well known that for a simple symmetric walk, $G_{2n}$ and $N_{2n}$ are discrete arcsine distributed, thus converging to the arcsine distribution. To apply Stein’s method for arcsine approximation we first need a characterizing operator.

Lemma 4.1 A random variable Z is arcsine distributed if and only if

\begin{equation*}\mathbb{E} [ Z(1-Z) f'(Z) + ( 1/2 - Z) f(Z) ] = 0\end{equation*}

for all functions $f \in \textrm{BC}^{2,1}[0,1]$ .

To apply Stein’s method, we proceed as follows. Let Z be an arcsine distributed random variable. Then, for any $h\in \text{Lip}(1)$ or $h \in \textrm{BC}^{2,1}[0,1]$ , assume we have a function f that solves

(4.1) \begin{align}x(1-x) f'(x) + ( 1/2- x ) f(x) = h(x) - \mathbb{E} h(Z).\end{align}

For an arbitrary random variable W, replace x with W in (4.1); taking expectations then yields an expression for $\mathbb E h(W)-\mathbb E h(Z)$ in terms of just W and f. Our goal is therefore to bound the expectation of the left-hand side of (4.1) by utilizing properties of f. Stein’s method for the beta distribution (of which the arcsine is a special case) was developed in [Reference Döbler24, Reference Goldstein and Reinert34], where an explicit Wasserstein bound between the discrete and the continuous arcsine distributions was also given. We will use the framework from [Reference Gan, Röllin and Ross29] to calculate error bounds for the class of test functions $\textrm{BC}^{2,1}$.

4.2. Proof of Theorem 2.4

To simplify the notation, let $W_n\;:\!=\;N_{2n}/2n$ be the fraction of positive edges of a simple symmetric random walk. Let $\Delta_yf(x)\;:\!=\;f(x+y) - f(x)$ . We will use the following known facts for the discrete arcsine distribution. For any function $f \in \textrm{BC}^{m,1}[0,1]$ ,

(4.2) \begin{equation}\mathbb{E} [ nW_n \left(1 - W_n + \frac{1}{2n} \right) \Delta_{1/n}f \left(W_n - \frac{1}{n} \right) + \left(\frac{1}{2} - W_n \right) f(W_n) ] = 0.\end{equation}

Moreover,

(4.3) \begin{equation}\mathbb{E}W_n = \frac{1}{2} , \qquad \mathbb{E}W_n^2 = \frac{3}{8} + \frac{1}{8n}.\end{equation}

The identity (4.2) can be read from [Reference Döbler24, Lemma 2.9] and [Reference Goldstein and Reinert34, Proof of Theorem 1.1]. The moments are easily derived by plugging in $f(x) = 1$ and $f(x) = x$ .
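
The moment identities (4.3) are also easy to confirm directly from the discrete arcsine weights (1.2); a short sketch (Python, exact rational arithmetic, function name ours):

```python
from fractions import Fraction
from math import comb

def discrete_arcsine(n):
    """Probabilities alpha_{2n,2k}, k = 0, ..., n, from (1.2)."""
    return [Fraction(comb(2 * k, k) * comb(2 * n - 2 * k, n - k), 4 ** n)
            for k in range(n + 1)]

for n in (1, 5, 20):
    p = discrete_arcsine(n)
    w = [Fraction(k, n) for k in range(n + 1)]        # values of W_n = 2k/(2n)
    m1 = sum(pk * wk for pk, wk in zip(p, w))
    m2 = sum(pk * wk ** 2 for pk, wk in zip(p, w))
    print(n, m1, m2 == Fraction(3, 8) + Fraction(1, 8 * n))   # 1/2 and True
```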

Proof of Theorem 2.4. The distribution (2.2) of $N_{2n}$ can be found in [Reference Feller28]. The bound (2.3) follows from the fact that $N_{2n}$ is discrete arcsine distributed, together with [Reference Goldstein and Reinert34, Theorem 1.2].

We prove the bound (2.4) using the generator method. Assume $h\in\textrm{BC}^{2,1}([0,1])$ , and recall the Stein equation (4.1) for the arcsine distribution. It follows from [Reference Gan, Röllin and Ross29, Theorem 5] that there exists $g\in\textrm{BC}^{2,1}([0,1])$ such that $x(1-x) g''(x) + ( 1/2- x ) g'(x) = h(x) - \mathbb{E} h(Z)$ , which is just (4.1) with $f = g'$ . We are therefore required to bound the absolute value of

\begin{equation*} \mathbb{E}h(W_n) - \mathbb{E}h(Z)= \mathbb{E}\left[W_n(1- W_n) g''(W_n) + \left(\frac{1}{2} - W_n \right) g'(W_n)\right].\end{equation*}

Applying (4.2) with f being replaced by $g'$, we obtain

\begin{equation*}\begin{split}&\mathbb{E}h(W_n) - \mathbb{E}h(Z) \\&\qquad= \mathbb{E} \left[ W_n(1-W_n) g''(W_n) - n W_n \left(1-W_n + \frac{1}{2n}\right) \Delta_{1/n} g'\left(W_n - \frac{1}{n} \right)\right] \\&\qquad= \mathbb{E}\left[W_n(1-W_n) \left(g''(W_n) - n \Delta_{1/n}g' \left(W_n - \frac{1}{n} \right) \right) - \frac{W_n}{2} \Delta_{1/n} g'\left(W_n - \frac{1}{n} \right)\right].\end{split}\end{equation*}

The second term in the expectation is bounded as

(4.4) \begin{equation}\left\lvert \mathbb{E} \left[\frac{W_n}{2} \Delta_{1/n} g'\left(W_n - \frac{1}{n} \right) \right] \right\rvert \le \frac{\mathbb{E}W_n}{2} \cdot \frac{|g|_2}{n} = \frac{|g|_2}{4n},\end{equation}

and the first term can be bounded as

(4.5) \begin{align}& \left\lvert \mathbb{E} \left[ nW_n(1-W_n) \int_{W_n - \frac{1}{n}}^{W_n} \big(g''(W_n) - g''(x)\big) \, \textrm{d} x \right] \right\rvert \notag \\& \qquad \le \left\lvert \mathbb{E} \left[ nW_n(1-W_n) |g|_{2,1}\int_{W_n - \frac{1}{n}}^{W_n} |W_n -x | \, \textrm{d} x \right] \right\rvert \notag\\& \qquad = |g|_{2,1} n \mathbb E \left[W_n(1-W_n) \int_0^{\frac1n} s \, \textrm{d} s\right]= \frac{|g|_{2,1}}{16}\left(\frac{1}{n} + \frac{1}{n^2} \right),\end{align}

where the last equality follows from (4.3). Combining (4.4), (4.5), and [Reference Gan, Röllin and Ross29, Theorem 5] (for relating the bounds on derivatives g with derivatives of h) yields the desired bound.

Remark 4.2 The above bound is essentially sharp. Take $h(x) =\frac{x^2}{2}$; then $\mathbb E h(W_n) - \mathbb E h(Z) = \frac{1}{16n}$ by (4.3), while the above bound gives $| \mathbb E h(W_n) - \mathbb E h(Z)| \leq \frac{1}{16n} + \frac{1}{64n^2}$.

5. Proof of Theorem 2.6

In this section we prove Theorem 2.6. To this end, we prove a general result for strong embeddings of a random walk with finitely dependent increments.

5.1. Strong embeddings of m-dependent walks

Let n, m be positive integers. Let $(X_i; \, i\in [n])$ be a sequence of m-dependent random variables. That is, $\{X_1, \ldots, X_j\}$ is independent of $\{X_{j+m+1}, \ldots, X_n\}$ for each $j \in [n-m-1]$ . Let $(S_k; \, k\in \{0,1,\dots, n\})$ be a random walk with increments $X_i$ , and $(S_t; \, 0 \le t \le n)$ be the linear interpolation of $(S_k; \, k\in \{0,1,\dots, n\})$ . Assume that the random variables $X_i$ are centered and scaled such that $\mathbb{E}X_i = 0$ for all $i \in [n]$ and $\textrm{var}(S_n) = n$ . Let $(B_t; \, t \ge 0)$ be a one-dimensional standard Brownian motion. The idea of strong embedding is to couple $(S_t; \, 0 \le t \le n)$ and $(B_t; \, 0 \le t \le n)$ in such a way that

(5.1) \begin{equation}\mathbb{P}\left(\sup_{0 \le t \le n}|S_t - B_t| > b_n \right) = p_n \end{equation}

for some $b_n = o(n^{\frac{1}{2}})$ and $p_n = o(1)$ as $n \rightarrow \infty$ (note that the typical fluctuation of $B_n$ is $O(n^{\frac{1}{2}})$).

The study of such embeddings dates back to [Reference Skorokhod57]. When the Xs are independent and identically distributed, [Reference Strassen61] obtained (5.1) with $b_n = \mathcal{O}(n^{\frac{1}{4}} (\log n)^{\frac{1}{2}} (\log \log n)^{\frac{1}{4}})$ ; [Reference Csörgö and Révész20] used a novel approach to prove that under the additional conditions $\mathbb{E}X_i^3 =0$ and $\mathbb{E}X_i^8 < \infty$ we get $b_n = \mathcal{O}(n^{\frac{1}{6} + \varepsilon})$ for any $\varepsilon > 0$ ; and [Reference Komlós, Major and Tusnády41, Reference Komlós, Major and Tusnády42] further obtained $b_n = \mathcal{O}(\log n)$ under a finite moment-generating function assumption. See also [Reference Bhattacharjee and Goldstein10, Reference Chatterjee17] for recent developments.

We use the argument from [Reference Csörgö and Révész20] to obtain the following result for m-dependent random variables.

Theorem 5.1 Let $(S_t; \, 0 \le t \le n)$ be the linear interpolation of partial sums of m-dependent random variables $(X_i; i\in [n])$ . Assume that $1 \le m \le n^{\frac{1}{5}}$ and $\mathbb{E}X_i = 0$ for each $i \in [n]$ . Further assume that $|X_i| \le \beta$ for each $i \in [n]$ , where $\beta > 0$ is a constant. Let

(5.2) \begin{equation}\eta\;:\!=\;\max_{\substack{k \in [n],\\ j \in \{0,\ldots,n-k\}}} |\textrm{var}(S_{j+k} - S_j) - k|.\end{equation}

For any $\varepsilon \in (0,1)$ , if $\eta \le n^{\varepsilon}$ then there exist positive constants $n_0$ , $c_1$ , and $c_2$ depending only on $\varepsilon$ such that, for any $n \ge n_0$ , we can define $(S_t; \, 0 \le t \le n)$ and $(B_t; \, 0 \le t \le n)$ on the same probability space such that

\begin{equation*}\mathbb{P}\left(\sup_{0 \le t \le n}|S_t - B_t| > c_1 n^{\frac{1 + \varepsilon}{4}} (\log n)^{\frac{1}{2}} m^{\frac{1}{2}} \beta \right) \le \frac{c_2 (m^4\beta^6 + \eta)}{m \beta^2 n^{\varepsilon} \log n}.\end{equation*}

If m and $\beta$ are absolute constants and $\textrm{var}(S_{j+k} - S_j)$ matches k up to an absolute constant, from Theorem 5.1 we get (5.1) with $b_n = \mathcal{O}(n^{\frac{1 + \varepsilon}{4}} (\log n)^{\frac{1}{2}})$ and $p_n = \mathcal{O}(1/(n^{\varepsilon} \log n))$ for any fixed $\varepsilon \in (0,1)$ .

Proof of Theorem 2.6. We apply Theorem 5.1 with $m = 1$ , and a suitable choice of $\beta$ and $\eta$ . By centering and scaling, we consider the walk $(S_t^{'}; \, 0 \le t \le n)$ with increments $X_i^{'} = \frac{1}{\sigma} (X_i -\mu)$ . It is easy to see that $|X_i^{'}| \le \frac{1}{\sigma} \max( 1- \mu, 1 + \mu ) = \beta$ . According to the result in Section 3,

\begin{align*} \mathbb{P}(X_k = X_{k+1} = 1) = \mathbb{P}(\mathcal{G}_{q, n+1-k} \le \mathcal{G}_{q,n-k} \le \mathcal{G}_{q,n-k-1}) & = \frac{1}{(1+q)(1+q+q^2)}, \\ \mathbb{P}(X_k = -1, X_{k+1} = 1) = \mathbb{P}(X_{k+1} = 1) - \mathbb{P}(X_k = X_{k+1} = 1) & = \frac{q}{1 + q + q^2}, \\ \mathbb{P}(X_k = X_{k+1} = -1) = \mathbb{P}(X_{k} = -1) - \mathbb{P}(X_k = -1, X_{k+1} = 1) & = \frac{q^3}{(1+q)(1+q+q^2)}.\end{align*}

By the one-dependence property, an elementary computation using these probabilities, which give $\textrm{cov}(X_k, X_{k+1}) = -4q^2/((1+q)^2(1+q+q^2))$, shows that, for $k \le n$, $\textrm{var}\, S_k^{'} = k + \eta$, which leads to the desired result.
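
The identity $\textrm{var}\, S_k^{'} = k + \eta$ can be verified exactly for any rational q from the pair probabilities displayed above together with one-dependence; a sketch (Python, exact rational arithmetic, with variable names of our choosing):

```python
from fractions import Fraction

q = Fraction(2, 5)                                   # any rational 0 < q < 1
mu = (1 - q) / (1 + q)
sigma2 = 4 * q * (1 - q + q * q) / ((1 + q) ** 2 * (1 + q + q * q))   # (2.1)
eta = 2 * q / (1 - q + q * q)

# pair probabilities from the display above
p_pp = 1 / ((1 + q) * (1 + q + q * q))               # P(X_k = X_{k+1} = +1)
p_mm = q ** 3 / ((1 + q) * (1 + q + q * q))          # P(X_k = X_{k+1} = -1)
p_mp = q / (1 + q + q * q)                           # P(X_k = -1, X_{k+1} = +1)
p_pm = 1 / (1 + q) - p_pp                            # P(X_k = +1, X_{k+1} = -1)

var_x = 4 * q / (1 + q) ** 2
cov = (p_pp + p_mm - p_pm - p_mp) - mu * mu          # cov(X_k, X_{k+1})

for k in (1, 2, 10, 100):
    var_sk = k * var_x + 2 * (k - 1) * cov           # one-dependence
    print(k, var_sk / sigma2 == k + eta)             # True for every k
```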

5.2. Proof of Theorem 5.1

The proof of Theorem 5.1 boils down to a series of lemmas. In the following, ‘sufficiently large n’ means $n\geq n_0$ for some $n_0$ depending only on $\varepsilon$. We use C and c to denote positive constants that depend only on $\varepsilon$ and may differ from one expression to another. Let $d\;:\!=\;\lceil n^{\frac{1-\varepsilon}{2}} \rceil$, where $\lceil x \rceil$ is the least integer greater than or equal to x. We divide the interval [0,n] into d subintervals by the points $\lceil jn/d \rceil$, $j\in [d]$, each with length $l=\lceil n/d \rceil$ or $l=\lceil n/d \rceil-1$. The following results hold for both values of l.

Lemma 5.2 Under the assumptions in Theorem 5.1, we have, for sufficiently large n,

(5.3) \begin{equation}4m\beta^2\geq 1 , \qquad l\geq 6m\log n .\end{equation}

Proof. By the definition of $\eta$ in (5.2), the m-dependence assumption, and the upper bounds on $\eta$ and $|X_i|$ , we have

\begin{equation*}n-n^\varepsilon \leq n-\eta \leq \textrm{var} S_n = \sum_{i = 1}^n \sum_{j: |j - i| \le m} \mathbb{E}X_i X_j \le n(2m+1)\beta^2 , \qquad m \ge 1,\end{equation*}

which implies $4m \beta^2 \ge 1$ for sufficiently large n. The second bound in (5.3) follows from the fact that $m \le n^{\frac{1}{5}}$ and $l \sim n^{\frac{1 + \varepsilon}{2}}$ .

Given two probability measures $\mu$ and $\nu$ on $\mathbb{R}$ , define their Wasserstein-2 distance by

\begin{equation*} d_{W_2}(\mu, \nu) = \left( \inf_{\pi \in \Gamma(\mu, \nu)} \int |x-y|^2 \, \textrm{d} \pi(x,y)\right)^{\frac{1}{2}},\end{equation*}

where $\Gamma(\mu, \nu)$ is the space of all probability measures on $\mathbb{R}^2$ with $\mu$ and $\nu$ as marginals. We will use the following Wasserstein-2 bound from [Reference Fang27]. We use $\mathcal{N}(\mu,\sigma^2)$ to denote the normal distribution with mean $\mu$ and variance $\sigma^2$ .
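
On the real line, the optimal coupling of two empirical measures with the same number of equally weighted atoms is the monotone (sorted) pairing, so $d_{W_2}$ between two equal-size samples can be computed directly; a minimal sketch (Python, for intuition only, function name ours):

```python
import math, random

def empirical_w2(xs, ys):
    """d_{W_2} between two equal-size empirical measures on R: pair sorted samples."""
    assert len(xs) == len(ys)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(sorted(xs), sorted(ys))) / len(xs))

a = [random.gauss(0.0, 1.0) for _ in range(10000)]
b = [random.gauss(0.5, 1.0) for _ in range(10000)]
print(empirical_w2(a, b))   # close to 0.5, the W_2 distance between N(0,1) and N(0.5,1)
```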

Lemma 5.3. (Corollary 2.3 of [Reference Fang27].) Let $W=\sum_{i=1}^n \xi_i$ be a sum of m-dependent random variables with $\mathbb{E} \xi_i=0$ and $\mathbb{E} W^2=1$ . We have

(5.4) \begin{equation}d_{W_2}(\mathcal{L}(W), \mathcal{N}(0,1))\leq C_0\left\{m^2 \sum_{i=1}^n \mathbb{E}|\xi_i|^3+m^{3/2} \left(\sum_{i=1}^n \mathbb{E} \xi_i^4\right)^{1/2} \right\},\end{equation}

where $C_0$ is an absolute constant.

Specializing the above lemma to bounded random variables, we obtain the following result.

Lemma 5.4 Under the assumptions in Theorem 5.1, we have, for sufficiently large n,

(5.5) \begin{equation}d_{W_2}(\mathcal{L}(S_{l -m}), \, \mathcal{N}(0,\sigma^2)) \le C m^2 \beta^3,\end{equation}

where $\sigma^2\;:\!=\;\textrm{var} S_{l - m}$ .

Proof. Applying (5.4) to $\sigma^{-1} S_{l-m}$ and using $|X_i|\leq \beta$ , we obtain

\begin{align*}d_{W_2}(\mathcal{L}(S_{l-m}), \, \mathcal{N}(0,\sigma^2)) & = \sigma \, d_{W_2}(\sigma^{-1} S_{l-m}, \mathcal{N}(0,1)) \\& \le \sigma C_0\left( lm^2 \left(\frac{\beta}{\sigma} \right)^3 + \left(lm^3 \left(\frac{\beta}{\sigma} \right)^4 \right)^{\frac{1}{2}} \right) \le Cm^2 \beta^3,\end{align*}

where we used $4 m \beta^2 \ge 1$ from (5.3), and $\sigma^2 \ge l-m-\eta \ge cl$ for sufficiently large n from (5.2), $m \le n^{\frac{1}{5}}$ , $\eta \le n^{\varepsilon}$ , $l \sim n^{\frac{1 + \varepsilon}{2}}$ , and $\varepsilon \in (0,1)$ .

Lemma 5.5 For sufficiently large n, there exists a coupling of $(S_t; \, 0 \le t \le n)$ and $(B_t; \, 0 \le t \le n)$ such that, with $e_j\;:\!=\;(S_{\lceil jn/d \rceil}-S_{\lceil (j-1)n/d \rceil})-(B_{\lceil jn/d \rceil}-B_{\lceil (j-1)n/d \rceil})$, the sequence $(e_1, \ldots, e_d)$ is one-dependent, and $\mathbb{E}e_j^2 \le C(m^4 \beta^6 + \eta)$ for all $j \in [d]$.

Proof. We use $4 m \beta^2 \ge 1$ implicitly below to absorb a few terms into $Cm^4 \beta^{6}$ . With $\sigma^2$ defined in Lemma 5.4, we have $d_{W_2}(\mathcal{N}(0,\sigma^2), \, \mathcal{N}(0,l)) \le \sqrt{|l- \sigma^2|} \le \sqrt{m + \eta}$ . Combining (5.5), the above bound, and the m-dependence assumption, we can couple $S_{\lceil jn/d \rceil-m}-S_{\lceil (j-1)n/d \rceil}$ and $B_{\lceil jn/d \rceil}-B_{\lceil (j-1)n/d \rceil}$ for each $j\in [d]$ independently with $\mathbb{E}[(S_{\lceil jn/d \rceil-m}-S_{\lceil (j-1)n/d \rceil})-(B_{\lceil jn/d \rceil}-B_{\lceil (j-1)n/d \rceil})]^2\leq C( m^4 \beta^6 + \eta)$ . By the m-dependence assumption, we can generate $X_1,\dots, X_n$ from their conditional distribution given $(S_{\lceil jn/d \rceil-m}-S_{\lceil (j-1)n/d \rceil}; \, j\in [d])$ , thus obtaining $(S_{t}; \, 0 \le t \le n)$ , and generate $(B_t; 0 \le t \le n)$ given $(B_{\lceil jn/d \rceil}; \, j\in [d])$ . Since $\mathbb{E}(S_{\lceil jn/d \rceil}-S_{\lceil jn/d \rceil-m})^2\leq Cm^2\beta^2$ , we have $\mathbb{E}(e_j^2)\leq C(m^4\beta^6+\eta)$ . Finally, the one-dependence of $(e_1, \ldots, e_d)$ follows from the m-dependence assumption.

Lemma 5.6 Let $T_j=\sum_{i=1}^j e_i$ , $j\in [d]$ . For any $b>0$ and sufficiently large n, $\mathbb{P}(\max_{j\in [d]}|T_j|>b )\leq C(m^4\beta^6+\eta ) d/b^2$ .

Proof. Define $T_j^{(1)}=\sum_{\substack{i=1,3,5,\dots\\ i\leq j}} e_i$ and $T_j^{(2)}=\sum_{\substack{i=2,4,6,\dots\\ i\leq j}} e_i$ . By Lemma 5.5, $T_j^{(1)}$ is a sum of independent random variables with zero mean and finite second moments. By Kolmogorov’s maximal inequality,

\begin{equation*}\mathbb{P}\left(\max_{1\leq j\leq d}\big|T_j^{(1)}\big|>\frac{b}{2} \right) \leq \frac{C(m^4 \beta^6+\eta)d}{b^2}.\end{equation*}

The same bound holds for $T_j^{(2)}$ . The lemma is proved by the union bound

\begin{equation}\mathbb{P}\left(\max_{j\in [d]}|T_j|>b\right)\leq \mathbb{P}\left(\max_{1\leq j\leq d}|T_j^{(1)}|>\frac{b}{2}\right)+\mathbb{P}\left(\max_{1\leq j\leq d}|T_j^{(2)}|>\frac{b}{2}\right). \end{equation}

Lemma 5.7 For any $0<b\leq 4l \beta$ , we have

\begin{equation*}\mathbb{P}\left(\max_{j\in [l]} |S_j- j S_l/l|>b\right)\leq 2l \exp\left(-\frac{b^2}{48 l m \beta^2}\right).\end{equation*}

Proof. We first prove a concentration inequality for $S_j$ , $j \in [l]$ , then use the union bound. Let $h(\theta)=\mathbb{E} \textrm{e}^{\theta S_j}$ , with $h(0)=1$ . Let $S_j^{(i)}=S_j-\sum_{k\in [j]: |k-i|\leq m} X_k$ . Using $\mathbb{E} X_i=0$ , $|X_i|\leq \beta$ , the m-dependence assumption, and the inequality (cf. [Reference Chatterjee16, Eq. (7)])

\begin{equation*} \left\lvert \frac{\textrm{e}^x-\textrm{e}^y}{x-y} \right\rvert\leq \frac{1}{2}(\textrm{e}^x+\textrm{e}^y),\end{equation*}

we have, for $\theta > 0$ and $\theta (2m+1)\beta\leq 1$ ,

\begin{equation*}\begin{split}h'(\theta) & =\mathbb{E} (S_j \textrm{e}^{\theta S_j}) =\sum_{i=1}^j \mathbb{E} X_i (\textrm{e}^{\theta S_j}-\textrm{e}^{\theta S_j^{(i)}}) \leq \frac{\theta}{2} \sum_{i=1}^j \mathbb{E} |X_i| \big|S_j-S_j^{(i)}\big| \left(\textrm{e}^{\theta S_j}+\textrm{e}^{\theta S_j^{(i)}} \right)\\&\leq \left(m+\frac{1}{2}\right) \theta l \beta^2 \mathbb{E} \textrm{e}^{\theta S_j} (1+\textrm{e}^{\theta (2m+1)\beta})\leq 6 \theta l m \beta^2 h(\theta).\end{split}\end{equation*}

This implies that $\log h(\theta)\leq 3l m \beta^2 \theta^2$ , and

\begin{equation*}\mathbb{P}(S_j>b/2)\leq \textrm{e}^{-\theta b/2} \mathbb{E} \textrm{e}^{\theta S_j}\leq \exp\left(-\frac{b^2}{48 l m \beta^2}\right)\end{equation*}

by choosing $\theta=b/(12lm\beta^2)$ , provided that $b\leq 4l\beta$ . The same bound holds for $-S_j$ . Consequently,

\begin{align}\mathbb{P}\left(\max_{j\in [l]} |S_j- j S_l/l|>b \right) & \leq \mathbb{P}\left(\max_{j\in [l-1]} |S_j|>b/2\right)+\mathbb{P}\left(|S_l|> b/2\right) \nonumber\\& \leq 2l \exp\left(-\frac{b^2}{48 l m \beta^2}\right). \end{align}

Lemma 5.8 For any $b>0$ , we have $\mathbb{P}(\sup_{0\leq t\leq l} |B_t-t B_l/l|>b)\leq 2 \textrm{e}^{-\frac{2b^2}{l}}$ .

Proof. We have

\begin{equation*}\begin{split}\mathbb{P}\left(\sup_{0\leq t\leq l} |B_t-t B_l/l|>b \right) & \leq \mathbb{P}\left(\sup_{0\leq t\leq l} (B_t-t B_l/l)>b \right)+ \mathbb{P}\left(\inf_{0\leq t\leq l} (B_t-t B_l/l)<-b \right) \\& = 2 \mathbb{P} \left(\sup_{0\leq t\leq l} (B_t-t B_l/l)>b\right)= 2\textrm{e}^{-\frac{2b^2}{l}},\end{split}\end{equation*}

where the last equality is the well-known distribution of the maximum of the Brownian bridge (cf. [Reference Shorack and Wellner56, p. 34]).

Now we proceed to proving Theorem 5.1.

Proof of Theorem 5.1. Let $b_l=(96lm\beta^2 \log n)^{1/2}$ and $b\;:\!=\;b_{\lceil \frac{n}{d} \rceil}$. Note that, since $m\leq l/(6 \log n)$ from (5.3), $b_l$ satisfies the condition $b_l\leq 4l\beta$ in Lemma 5.7 for sufficiently large n. Note also that if $\sup_{0\leq t\leq n}|S_t-B_t|>3b$ then either $\max_{j\in [d]}|T_j|>b$ or the fluctuation of either $S_t$ or $B_t$ within some subinterval of length l is larger than $b_l$. By the union bound and Lemmas 5.6–5.8, we have

\begin{align*}\begin{split}&\mathbb{P}\left(\sup_{0\leq t\leq n}|S_t-B_t|>3b\right)\\&\qquad \leq \mathbb{P}\left(\max_{j\in [d]} |T_j|>b\right)+d \mathbb{P}\left(\max_{j\in [l]}|S_j-j S_l/l|>b_l\right)+d \mathbb{P} \left(\sup_{0\leq t\leq l} |B_t-t B_l/l|>b_l\right)\\&\qquad\leq \frac{C(m^4 \beta^6+\eta)d}{b^2} + 2dl \exp\left(-\frac{b_l^2}{48l m \beta^2}\right)+2d \exp\left(-\frac{2b_l^2}{l}\right)\\&\qquad\leq \frac{C(m^4 \beta^6+\eta)}{n^\varepsilon m \beta^2 \log n}+\frac{C}{n}\leq \frac{C(m^4\beta^6+\eta)}{m\beta^2 n^{\varepsilon}\log n},\end{split}\end{align*}

where we used $4m\beta^2\geq 1$ for sufficiently large n. This proves the theorem.

Acknowledgements

The initial portion of this work was conducted at the meeting ‘Stein’s method and applications in high-dimensional statistics’ held at the American Institute of Mathematics in August 2018. We are indebted to Bhaswar Bhattacharya, Sourav Chatterjee, Persi Diaconis, and Jon Wellner for helpful discussions throughout the project. We would also like to express our gratitude to John Fry and staff Estelle Basor, Brian Conrey, and Harpreet Kaur at the American Institute of Mathematics for their generosity and excellent hospitality in hosting this meeting at the Fry’s Electronics corporate headquarters in San Jose, CA, and Jay Bartroff, Larry Goldstein, Stanislav Minsker, and Gesine Reinert for organizing such a stimulating meeting.

WT thanks Yuting Ye for communicating the problem of the limiting distribution of the sojourn time of a random walk generated from the uniform permutation, which brought him to this work, and Jim Pitman for helpful discussions. SH acknowledges support from the NSF DMS grant 1501767. XF acknowledges support from Hong Kong RGC ECS 24301617, GRF 14302418, 14304917. We also thank the anonymous referees for their careful reading and for pointing out some errors in the first draft of the manuscript.

Finally, we thank the Institute for Mathematical Sciences at the National University of Singapore, where part of this work was undertaken, for their kind support.

References

Akahori, J. (1995). Some formulae for a new type of path-dependent option. Ann. Appl. Prob. 5, 383–388.
Andersen, E. S. (1953). On sums of symmetrically dependent random variables. Skand. Aktuarietidskr. 36, 123–138.
Arratia, R., Barbour, A. and Tavaré, S. (2003). Logarithmic Combinatorial Structures: A Probabilistic Approach (EMS Monogr. Math. 1). EMS Publishing House, Zurich.
Barlow, M., Pitman, J. and Yor, M. (1989). Une extension multidimensionnelle de la loi de l’arc sinus. In Séminaire de Probabilités (Lect. Notes Math. 23). Springer, Berlin, pp. 294–314.
Bassino, F., Bouvel, M., Féray, V., Gerin, L. and Pierrot, A. (2018). The Brownian limit of separable permutations. Ann. Prob. 46, 2134–2189.
Basu, R. and Bhatnagar, N. (2017). Limit theorems for longest monotone subsequences in random Mallows permutations. Ann. Inst. H. Poincaré Prob. Statist. 53, 1934–1951.
Bernardi, O., Duplantier, B. and Nadeau, P. (2010). A bijection between well-labelled positive paths and matchings. Séminaire Lotharingien de Combinatoire 63, B63e.
Bertoin, J. and Doney, R. (1997). Spitzer’s condition for random walks and Lévy processes. Ann. Inst. H. Poincaré Prob. Statist. 33, 167–178.
Bertoin, J. (1993). Splitting at the infimum and excursions in half-lines for random walks and Lévy processes. Stoch. Process. Appl. 47, 17–35.
Bhattacharjee, C. and Goldstein, L. (2016). On strong embeddings by Stein’s method. Electron. J. Prob. 21, 1–30.
Billingsley, P. (1956). The invariance principle for dependent random variables. Trans. Amer. Math. Soc. 83, 250–268.
Billingsley, P. (1999). Convergence of Probability Measures, 2nd ed. John Wiley, New York.
Bingham, N. and Doney, R. (1988). On higher-dimensional analogues of the arc-sine law. J. Appl. Prob. 25, 120–131.
Borodin, A., Diaconis, P. and Fulman, J. (2010). On adding a list of numbers (and other one-dependent determinantal processes). Bull. Amer. Math. Soc. 47, 639–670.
Carlitz, L. (1973). Permutations with prescribed pattern. Math. Nachr. 58, 31–53.
Chatterjee, S. (2007). Stein’s method for concentration inequalities. Prob. Theory Relat. Fields 138, 305–321.
Chatterjee, S. (2012). A new approach to strong embeddings. Prob. Theory Relat. Fields 152, 231–264.
Chatterjee, S. and Diaconis, P. (2017). A central limit theorem for a new statistic on permutations. Indian J. Pure Appl. Math. 48, 561–573.
Chung, K. L. and Feller, W. (1949). On fluctuations in coin-tossing. Proc. Nat. Acad. Sci. 35, 605–608.
Csörgö, M. and Révész, P. (1975). A new method to prove Strassen type laws of invariance principle. I. Z. Wahrscheinlichkeitsth. 31, 255–259.
de Bruijn, N. G. (1970). Permutations with given ups and downs. Nieuw Arch. Wiskd. 18, 61–65.
de Valk, V. (1994). One-Dependent Processes: Two-Block-Factors and Non-Two-Block-Factors (CWI Tracts 85). Centrum voor Wiskunde en Informatica, Amsterdam.
Diaconis, P. (1988). Group Representations in Probability and Statistics (Lect. Notes Monogr. 11). Institute of Mathematical Statistics, Hayward, CA.
Döbler, C. (2012). A rate of convergence for the arcsine law by Stein’s method. Preprint, arXiv:1207.2401.
Dynkin, E. B. (1965). Markov Processes, Vols. I, II (Grundlehren der Mathematischen Wissenschaften 121, 122). Springer, Berlin.
Erdös, P. and Kac, M. (1947). On the number of positive sums of independent random variables. Bull. Amer. Math. Soc. 53, 1011–1020.
Fang, X. (2019). Wasserstein-2 bounds in normal approximation under local dependence. Electron. J. Prob. 24, 1–14.
Feller, W. (1968). An Introduction to Probability Theory and Its Applications, Vol. I, 2nd ed. John Wiley, New York.
Gan, H. L., Röllin, A. and Ross, N. (2017). Dirichlet approximation of equilibrium distributions in Cannings models with mutation. Adv. Appl. Prob. 49, 927–959.
Getoor, R. and Sharpe, M. (1994). On the arc-sine laws for Lévy processes. J. Appl. Prob. 31, 76–89.
Gladkich, A. and Peled, R. (2018). On the cycle structure of Mallows permutations. Ann. Prob. 46, 1114–1169.
Gnedin, A. and Olshanski, G. (2006). Coherent permutations with descent statistic and the boundary problem for the graph of zigzag diagrams. Int. Math. Res. Not. 2006, 51968.
Gnedin, A. and Olshanski, G. (2010). q-exchangeability via quasi-invariance. Ann. Prob. 38, 2103–2135.
Goldstein, L. and Reinert, G. (2013). Stein’s method for the beta distribution and the Pólya–Eggenberger urn. J. Appl. Prob. 50, 1187–1205.
Hoffman, C., Rizzolo, D. and Slivken, E. (2017a). Pattern-avoiding permutations and Brownian excursion part I: Shapes and fluctuations. Random Structures Algorithms 50, 394–419.
Hoffman, C., Rizzolo, D. and Slivken, E. (2017b). Pattern-avoiding permutations and Brownian excursion, part II: Fixed points. Prob. Theory Relat. Fields 169, 377–424.
Holroyd, A., Hutchcroft, T. and Levy, A. (2020). Mallows permutations and finite dependence. Ann. Prob. 48, 343–379.
Janson, S. (2017). Patterns in random permutations avoiding the pattern 132. Combinatorics Prob. Comput. 26, 24–51.
Karatzas, I. and Shreve, S. E. (1987). A decomposition of the Brownian path. Statist. Prob. Lett. 5, 87–93.
Kasahara, Y. and Yano, Y. (2005). On a generalized arc-sine law for one-dimensional diffusion processes. Osaka J. Math. 42, 1–10.
Komlós, J., Major, P. and Tusnády, G. (1975). An approximation of partial sums of independent RVs, and the sample DF. I. Z. Wahrscheinlichkeitsth. 32, 111–131.
Komlós, J., Major, P. and Tusnády, G. (1976). An approximation of partial sums of independent RVs, and the sample DF. II. Z. Wahrscheinlichkeitsth. 34, 33–58.
Lévy, P. (1939). Sur certains processus stochastiques homogènes. Compositio Math. 7, 283–339.
Lévy, P. (1965). Processus stochastiques et mouvement brownien. Suivi d’une note de M. Loève. Deuxième édition revue et augmentée. Gauthier-Villars & Cie, Paris.
McDonald, J. H. (2009). Handbook of Biological Statistics. Sparky House Publishing, Baltimore, MD.
MacMahon, P. A. (1960). Combinatory Analysis. Chelsea Publishing Co., New York.
Mallows, C. L. (1957). Non-null ranking models. I. Biometrika 44, 114–130.
Niven, I. (1968). A combinatorial problem of finite sequences. Nieuw Arch. Wisk. 16, 116–123.
Oshanin, G. and Voituriez, R. (2004). Random walk generated by random permutations of $\{1, 2, 3, \ldots, n+1\}$. J. Phys. A 37, 6221.
Pitman, J. (2006). Combinatorial Stochastic Processes (Lect. Notes Math. 1875). Springer, Berlin.
Pitman, J. (2018). Random weighted averages, partition structures and generalized arcsine laws. Preprint, arXiv:1804.07896.
Pitman, J. and Tang, W. (2019). Regenerative random permutations of integers. Ann. Prob. 47, 1378–1416.
Pitman, J. and Yor, M. (1992). Arcsine laws and interval partitions derived from a stable subordinator. Proc. London Math. Soc. 65, 326–356.
Rogers, L. C. G. and Williams, D. (1987). Diffusions, Markov Processes, and Martingales, Vol. 2. John Wiley, New York.
Salari, K., Tibshirani, R. and Pollack, J. R. (2010). DR-Integrator: A new analytic tool for integrating DNA copy number and gene expression data. Bioinformatics 26, 414–416.
Shorack, G. R. and Wellner, J. A. (1986). Empirical Processes with Applications to Statistics. John Wiley, New York.
Skorokhod, A. V. (1965). Studies in the Theory of Random Processes. Addison-Wesley Publishing Co., Inc., Reading, MA.
Stanley, R. (1976). Binomial posets, Möbius inversion, and permutation enumeration. J. Combinatorial Theory A 20, 336–356.
Stanley, R. (1999). Enumerative Combinatorics, Vol. 2 (Camb. Studies Adv. Math. 62). Cambridge University Press.
Starr, S. (2009). Thermodynamic limit for the Mallows model on ${S}_n$. J. Math. Phys. 50, 095208.
Strassen, V. (1967). Almost sure behavior of sums of independent random variables and martingales. In Proc. Fifth Berkeley Symp. Math. Statist. Prob., Vol. 2.
Takács, L. (1996). On a generalization of the arc-sine law. Ann. Appl. Prob. 6, 1035–1040.
Tang, W. (2019). Mallows ranking models: Maximum likelihood estimate and regeneration. Proc. Mach. Learn. Res. 97, 6125–6134.
Tarrago, P. (2018). Zigzag diagrams and Martin boundary. Ann. Prob. 46, 2562–2620.
Viennot, G. (1979). Permutations ayant une forme donnée. Discrete Math. 26, 279–284.
Wang, R., Waterman, M. and Huang, H. (2014). Gene coexpression measures in large heterogeneous samples using count statistics. Proc. Nat. Acad. Sci. 111, 16371–16376.
Wang, R., Liu, K., Theusch, E., Rotter, J., Medina, M., Waterman, M. and Huang, H. (2017). Generalized correlation measure using count statistics for gene expression data with ordered samples. Bioinformatics 34, 617–624.
Watanabe, S. (1995). Generalized arc-sine laws for one-dimensional diffusion processes and random walks. In Proc. Symp. Pure Math., Vol. 57, pp. 157–172. American Mathematical Society, Providence, RI.
Williams, D. (1969). Markov properties of Brownian local time. Bull. Amer. Math. Soc. 75, 1035–1036.