1. Introduction
Let $\Gamma \backslash \mathbb {H}$ be a finite volume hyperbolic surface. A basic problem in quantum chaos is to understand the limiting behavior of $L^2$ normalized Laplace eigenfunctions $\varphi $ on $\Gamma \backslash \mathbb {H}$ . This behavior can be quantified through weak limits of $L^2$ masses (‘quantum ergodicity’), bounds for $L^p$ norms and so forth. We consider in this paper the supnorm problem, which consists of bounding the supremum or $L^\infty $ norm of an $L^2$ normalized eigenfunction $\varphi $ with respect to the eigenvalue $\lambda _\varphi $ and/or the geometry of the underlying manifold $\Gamma \backslash \mathbb {H}$ . A general bound in this direction, due to Bérard [Bér77], asserts that
$$\|\varphi \|_{\infty } \ll _{\Gamma } \frac {\lambda _\varphi ^{\frac {1}{4}}}{\sqrt {\log (2+\lambda _\varphi )}}. \tag{1.1}$$
Here and henceforth, $A \ll B$ means that there is a constant C such that $A \le CB$ ; we allow C to depend on any subscripts of $\ll $ and write $\varepsilon $ for an arbitrary, but sufficiently small, positive constant, which may change from line to line.
Stronger bounds have been established in the arithmetic case that

∘ $\Gamma \backslash \mathbb {H}$ is an arithmetic manifold, such as the modular surface $\operatorname {\mathrm {SL}}_2(\mathbb {Z}) \backslash \mathbb {H}$ or a congruence cover, and

∘ $\varphi $ is a Hecke–Maaß form, that is, an eigenfunction not only of the Laplacian but also of the Hecke operators.
The pioneering result in that case is due to Iwaniec–Sarnak [IS95], who showed for congruence lattices $\Gamma $ that
$$\|\varphi \|_{\infty } \ll _{\Gamma , \varepsilon } \lambda _\varphi ^{\frac {5}{24} + \varepsilon }. \tag{1.2}$$
The above estimates depend in an unspecified manner upon the underlying manifold. Consider, for instance, the case that $\Gamma $ is the Hecke congruence subgroup $\Gamma _0(N) = \operatorname {\mathrm {SL}}_2(\mathbb {Z}) \cap \left ( \begin {smallmatrix} \mathbb {Z} &\mathbb {Z} \\ N \mathbb {Z} & \mathbb {Z} \end {smallmatrix} \right )$ so that $\Gamma \backslash \mathbb {H}$ is an arithmetic manifold of volume $N^{1+o(1)}$ . We suppose that N is squarefree. A direct quantification of the Iwaniec–Sarnak argument (see [BH10, §10]) gives the estimate
where we normalize $\varphi $ to have $L^2$ norm one with respect to the hyperbolic probability measure, that is, the multiple of the hyperbolic measure having total volume one. The level aspect case of the supnorm problem is to improve the dependence of the bound (1.3) upon N. The first improvement in the exponent was a major breakthrough of Blomer–Holowinsky [BH10], achieved 13 years after the work of Iwaniec–Sarnak. For a Hecke–Maaß newform $\varphi $ of eigenvalue $\lambda _{\varphi }$ , they managed to show
(with explicit polynomial dependence upon $\lambda _\varphi $ ). Subsequently, Templier [Tem10] and Harcos–Templier [HT12, HT13] established several improved bounds, culminating in
$$\|\varphi \|_{\infty } \ll _{\lambda _\varphi , \varepsilon } N^{\frac {1}{3} + \varepsilon }. \tag{1.5}$$
The estimate (1.5) is comparable in strength to the Weyl bound for the Riemann zeta function and has long been regarded as a natural limit for the supnorm problem in the squarefree level aspect [HT13, Remarks (i)]. It has been extended to number fields [BHM16, BHMM20, Ass24] and to more general vectors than newforms [HNS19, Ass21]. For levels that are not squarefree (e.g., powers of a fixed prime), the flavor of the problem is quite different (see Remark 1.4), and stronger estimates have been achieved in [Sah17, Mar16, Sah20, Com21, HS20].
In this work, we bring new methodology to bear on the supnorm problem in the squarefree level aspect. By obtaining optimal solutions to the technical problems that arise in applying that methodology, we deduce the following improvement of Equation (1.5).
Theorem 1.1. Let N be a squarefree natural number. Let $\varphi $ be a cuspidal Hecke–Maaß newform for $\Gamma _0(N)$ with trivial (central) character. Suppose that $\varphi $ is $L^2$ normalized with respect to the hyperbolic probability measure on $\Gamma _0(N) \backslash \mathbb {H}$ . Then
$$\|\varphi \|_{\infty } \ll _{\lambda _\varphi , \varepsilon } N^{\frac {1}{4} + \varepsilon }.$$
Our main results apply not only to $\Gamma _0(N) \backslash \mathbb {H}$ but also to compact arithmetic quotients. In general, such a manifold is of the shape $\Gamma \backslash \mathbb {H}$ , where $\Gamma $ is commensurable with a lattice attached to a maximal order in a quaternion algebra B over a totally real field F, with B split at exactly one Archimedean place. We are content here to consider the case $F = \mathbb {Q}$ so that B is an indefinite quaternion algebra, characterized up to isomorphism by its reduced discriminant $d_B$ . For each natural number N coprime to $d_B$ , we denote by $\Gamma _0^B(N)$ the group of proper (i.e., norm one) units arising from an Eichler order of level N in B (see Section 2.1 for details). For example, if $B = \operatorname {\mathrm {Mat}}_{2 \times 2}(\mathbb {Q})$ , then we could take $\Gamma _0^B(N) = \Gamma _0(N)$ . We prove the following theorem.
Theorem 1.2. Let $\Gamma =\Gamma ^B_0(N)$ be as above with the level N being squarefree. Let $\varphi $ be a cuspidal Hecke–Maaß newform for $\Gamma $ with trivial (central) character, $L^2$ normalized with respect to the hyperbolic probability measure on $\Gamma \backslash \mathbb {H}$ . Then, with $V = (d_BN)^{1+o(1)}$ the covolume of $\Gamma $ ,
$$\|\varphi \|_{\infty } \ll _{\lambda _\varphi , \varepsilon } V^{\frac {1}{4} + \varepsilon }. \tag{1.6}$$
Theorem 1.2 specializes to Theorem 1.1 upon taking $B = \operatorname {\mathrm {Mat}}_{2 \times 2}(\mathbb {Q})$ . It improves upon (the $F = \mathbb {Q}$ case of) Templier’s result [Tem10], which gave the nontrivial bound $V^{\frac {1}{2} - \frac {1}{24} + \varepsilon }$ . We emphasize that the estimate (1.6) is uniform in the quaternion algebra B, hence gives a strong saving in the ‘discriminant aspect’; the first nontrivial results in that aspect (for B indefinite, as we have assumed) were established only very recently by Toma [Tom23], updating an earlier preprint, giving (among other things) the bound $V^{\frac {1}{2} - \frac {1}{30}+\varepsilon }$ . Our method applies equally in the setting of definite quaternion algebras, where we improve the exponent $\frac {1}{3}$ of Blomer–Michel [BM11, BM13] down to $\frac {1}{4}$ in analogy with Theorem 1.2 (see Section §2.3 for details).
Remark 1.3. The dependence on the eigenvalue in Equation (1.6) that follows from our proof is of exponential nature. With some finer Archimedean considerations, it seems likely that one could show $\|\varphi \|_{\infty } \ll _{\varepsilon } \lambda _\varphi ^{\frac {1}{4} +\varepsilon } V ^{\frac {1}{4} + \varepsilon }$ ; indeed, by comparison, we obtain such an estimate for the definite analogue of Equation (1.6) (see Corollary 2.3). Such a refinement of Equation (1.6) seems to require lengthy Archimedean calculations that we feel would distract from the primary novelties of this paper concerning the level aspect.
Remark 1.4. We have noted already that we focus in this paper on the case of squarefree levels. The opposite case is the depth aspect, where the level is a power $N = p^n$ of a fixed prime p. In that case, local arguments give the bound $\|\varphi \|_\infty \ll _{p,d_B, \varepsilon } (\lambda _\varphi N)^{1/4+\varepsilon }$ [Mar16], which has been improved to $\|\varphi \|_\infty \ll _{\lambda _\varphi ,p,d_B, \varepsilon } N^{5/24+\varepsilon }$ [HS20] via arithmetic amplification and refined local analysis.
Remark 1.5. In a function field setting analogous to that of Theorem 1.1, Sawin [Saw21] has used geometric techniques to establish (among other things) the supnorm bound $\ll N^{\frac {1}{4} + \alpha _q}$ , where $\alpha _q> 0$ tends to zero as the cardinality q of the underlying finite field tends to $\infty $ . We do not see any obstruction to adapting the techniques of this paper to the function field setting, where we expect they would give the improved bound $\ll _{\varepsilon } N^{\frac {1}{4} + \varepsilon }$ .
By combining the arguments of this paper with those of the prequel [KS20] concerning the weight aspect for holomorphic forms, we obtain the following uniform hybrid bound in the weight and level aspects.
Theorem 1.6. Let $\Gamma = \Gamma ^B_0(N)$ be as in Theorem 1.2. Let f be a cuspidal holomorphic newform for $\Gamma $ with trivial (central) character and weight $k\ge 2$ . Suppose f is $L^2$ normalized with respect to the hyperbolic probability measure on $\Gamma \backslash \mathbb {H}$ . Then
where $V = (d_BN)^{1+o(1)}$ denotes the covolume of $\Gamma $ .
1.1. Selected applications
A straightforward application of these improved supnorms is to $L^p$ norms for $2\le p \le \infty $ by means of interpolation. We state here only the split holomorphic case, as in this case, strong $L^4$ bounds were given by Buttcane–Khan [BK15] with subconvexity input from [You17].
Corollary 1.7. Let q denote an odd prime and f a cuspidal holomorphic newform for $\Gamma _0(q)$ with trivial (central) character and weight k. Suppose f is $L^2$ normalized with respect to the hyperbolic probability measure on $\Gamma _0(q) \backslash \mathbb {H}$ . Then, for $2 \le p \le \infty $ and any $\eta>0$ , we have
for k sufficiently large in terms of $\eta $ .
Further applications of supnorm bounds include shifted convolution problems and subconvexity results for L-functions; see, for example, [Har03, HM06, HC19, HS20, Nor21]. Often, such applications would be obtained from a uniform version of Wilton’s estimate. By applying the arguments of [HM06, §2.7] with our improved supnorm bound, we derive the following corollary.
Corollary 1.8. Let $\lambda (m)$ , $m \in \mathbb {N}$ , denote the Hecke eigenvalues, normalized so that the Ramanujan conjecture reads $\lambda (m) \ll _\varepsilon m^{\varepsilon }$ , of either a cuspidal Hecke–Maaß newform or a cuspidal holomorphic newform of weight k on $\Gamma _0(N)$ with trivial (central) character, where N is squarefree. Then, for any $\alpha \in \mathbb {R}$ , one has
where the implied constant in the Maaß case further depends on the eigenvalue of the form.
As a consequence, we may, for example, improve the main theorem in [HC19].
Corollary 1.9. Let $\varphi $ either be a cuspidal Hecke–Maaß newform or a cuspidal holomorphic newform on $\Gamma _0(q)$ , with q prime. Let $\chi $ be a primitive Dirichlet character of modulus m with $(m,q)=1$ . Suppose that $q = m^{\eta }$ with $0 < \eta < 2$ . Then, we have
where the implied constant depends on the eigenvalue (respectively, weight) of $\varphi $ , $\mathcal {C}=qm^2$ is the conductor of the L-function and $\vartheta $ is the current best bound towards the generalized Ramanujan conjecture if $\varphi $ is a Maaß form and $0$ if $\varphi $ is holomorphic.
1.2. The fourth moment and further applications
The method underlying most previous works on this problem, including the work of Harcos–Templier giving the bound $\ll _{\epsilon } N^{1/3+\epsilon }$ , is based on the amplification method introduced in the original paper of Iwaniec–Sarnak. Recently, Steiner [Ste20] and Khayutin–Steiner [KS20] introduced a new method based on analysis of fourth moments over families. The key observation of these papers was that such a fourth moment naturally arises as the $L^2$ norm of a theta kernel. Alternatively, Blomer et al. [BHMM22] have demonstrated that one may use Voronoï summation for Rankin–Selberg convolutions in place of a theta kernel. Prior to the application to fourth moments, theta kernels have played similar roles in the study of quantum variance [Nel16, Nel17, Nel19, Nel20], numerical computations [Nel15] and in the proof of Waldspurger’s formula [Wal85]. In each of these earlier works, theta kernels apparently served as a substitute for parabolic Fourier expansions, giving a tool for establishing analogues on compact quotients (where such expansions are not available) of results known already for noncompact quotients. The present work differs in that our main result is new even for the noncompact quotients $\Gamma _0(N) \backslash \mathbb {H}$ .
In this paper, we follow generally the theta kernel strategy of the prequel [KS20] and prove a fourth moment bound from which one may deduce Theorems 1.1, 1.2 and 1.6 after some additional analysis near any cusps. In what follows, we let $\Gamma = \Gamma ^B_0(N)$ be a lattice as in Theorem 1.2 and denote by $V = (d_B N)^{1+o(1)}$ the volume of $\Gamma \backslash \mathbb {H}$ .
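To make the relation $V = (d_B N)^{1+o(1)}$ concrete, here is a minimal Python sketch. It assumes the classical area formula $\frac{\pi}{3} \prod_{p \mid d_B}(p-1) \prod_{p \mid N}(p+1)$ for squarefree, coprime $d_B$ and $N$; this formula is standard but is not stated in the text above, and the function names `covolume` and `volume_exponent` are purely illustrative.

```python
import math

def prime_factors(n):
    """Distinct prime factors of n, found by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def covolume(d_B, N):
    """Hyperbolic area of the quotient attached to an Eichler order of
    squarefree level N in the algebra of reduced discriminant d_B.
    Assumption: the classical formula (pi/3) * prod(p-1) * prod(p+1)."""
    area = math.pi / 3
    for p in prime_factors(d_B):
        area *= p - 1
    for p in prime_factors(N):
        area *= p + 1
    return area

def volume_exponent(d_B, N):
    """log V / log(d_B N); this should approach 1 as d_B N grows."""
    return math.log(covolume(d_B, N)) / math.log(d_B * N)
```

For instance, `covolume(1, 6)` returns $4\pi$, which is $\frac{\pi}{3}$ times the index $12$ of $\Gamma_0(6)$ in $\operatorname{SL}_2(\mathbb{Z})$.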
The formulation of our results requires some quantification of the closeness of a point $z \in \Gamma \backslash \mathbb {H}$ to the cusps. If $\Gamma \backslash \mathbb {H}$ is noncompact (i.e., $d_B = 1$ ), then we may assume that $\Gamma = \Gamma _0(N)$ , and we set
where $A_0(N)$ denotes the lattice of Atkin–Lehner operators for $\Gamma _0(N)$ (see Section §2.2 for another formulation of the definition of H). If $\Gamma \backslash \mathbb {H}$ is compact, then we set $H(z) = 0$ .
Theorem 1.10. Let $\Gamma = \Gamma ^B_0(N)$ be as in Theorem 1.2. Fix $\Lambda> 0$ , and let $(\varphi _i)_i$ be an orthonormal set of cuspidal Hecke–Maaß newforms with trivial (central) character and Laplace eigenvalue bounded by $\Lambda $ on the hyperbolic surface $\Gamma \backslash \mathbb {H}$ equipped with the hyperbolic probability measure. Then, for any two points $z,w \in \Gamma \backslash \mathbb {H}$ , we have
Similarly, for an orthonormal set $(f_i)_i$ of cuspidal holomorphic newforms for $\Gamma $ of weight k and trivial (central) character with respect to the hyperbolic probability measure on $\Gamma \backslash \mathbb {H}$ , we have
for any two points $z,w \in \Gamma \backslash \mathbb {H}$ .
In the case that the hyperbolic surface $\Gamma \backslash \mathbb {H}$ is compact, we may integrate z and w over the whole surface and get an essentially sharp bound on the fourth moment of fourth norms in the level aspect, thereby extending a result of Blomer [Blo13] to the case of cocompact lattices $\Gamma $ .
Corollary 1.11. With notation and assumptions as in Theorem 1.10 and assuming further that $\Gamma \backslash \mathbb {H}$ is compact, we have
This result may also be recast as a double average of triple L-functions by means of Watson’s formula [Wat08, Theorem 3].
The final application of Theorem 1.10 we mention is to the diameter of compact arithmetic hyperbolic surfaces $\Gamma \backslash \mathbb {H}$ [Ste23]. Here, one may use the sharp bound on the ‘fourth moment’ of exceptional eigenforms, together with a strong density estimate for the exceptional eigenvalues, to get an optimal estimate on the almost diameter and an estimate on the diameter of the same strength as if one were to assume the Selberg eigenvalue conjecture.
1.3. The added complexity of the level aspect
Compared to the weight aspect treated in the prequel, the level aspect requires many new ideas. Here, we tacitly restrict to the case of squarefree level; the general case would require a more nuanced discussion. In some sense, the level aspect may be understood as intermediate in difficulty between the holomorphic and eigenvalue aspects. Indeed, relative to known techniques, the difficulty in the supnorm problem is reflected in the essential support of the matrix coefficient of the automorphic form being bounded. In the weight, (squarefree) level and eigenvalue aspects, the matrix coefficient concentrates on a space of dimension one, two and three, respectively.
We now briefly recall the main idea of the theta approach and discuss some of the new challenges that arise in the level aspect. We focus first on the case of Hecke–Maaß forms on $\Gamma _0(N) \backslash \mathbb {H}$ , as in Theorem 1.1. Take $R=\left (\begin {smallmatrix} \mathbb {Z} & \mathbb {Z} \\ N \mathbb {Z} & \mathbb {Z} \end {smallmatrix}\right )$ so that the set of proper units of R is precisely $\Gamma _0(N)$ . For $\ell \mid N$ , let $R(\ell )= \left (\begin {smallmatrix} \mathbb {Z} & \mathbb {Z} / \ell \\ N \mathbb {Z} / \ell & \mathbb {Z} \end {smallmatrix}\right )$ denote the partially dualized lattices of the order R. Let $\sigma _z \in \operatorname {\mathrm {SL}}_2(\mathbb {R})$ be any matrix taking i to $z \in \mathbb {H}$ . Let $\varphi $ be an arithmetically normalized cuspidal Hecke–Maaß newform. The theta identity at the heart of the argument then reads
where V denotes the covolume of $\Gamma _0(N)$ and the theta function is given by
By Bessel’s inequality, the left-hand side of Equation (1.7) is in essence captured by the $L^2$ norm of the difference of the theta kernels $\theta (z,z;\cdot ) - \theta (w,w;\cdot )$ . From here, one may then proceed as in the prequel by covering a fundamental domain by Siegel sets and making use of the orthogonality relations in the unipotent direction. One ends up with a weighted sum over matrices $\gamma _1, \gamma _2 \in R(\ell )$ satisfying $\det (\gamma _1)=\det (\gamma _2)$ and for which the entries of $\sigma _z^{-1} \gamma _{i} \sigma _z$ , $i=1,2$ , satisfy certain bounds (and similarly for w). The bounds imposed on these entries depend crucially upon the precise choice of Siegel domains, so it is important that we make a good choice. As in the prequel, we split the count according to whether $\operatorname {\mathrm {tr}}(\gamma _1) = \operatorname {\mathrm {tr}}(\gamma _2)$ or not.
In the case of nonequal trace, the naïve choice of Siegel domains consisting of $\Gamma _0(N) \backslash \operatorname {\mathrm {SL}}_2(\mathbb {Z})$ translates of the standard Siegel domain for $\operatorname {\mathrm {SL}}_2(\mathbb {Z})$ leads to a rather challenging counting problem. In order to get a sharp bound on Equation (1.7), one faces the challenge of counting, for each divisor $\ell $ of N and each T with $\ell ^{-1/2} \ll T \ll 1$ , the sextuples of integers $(a_1,b_1,c_1,a_2,b_2,c_2)$ satisfying
We would need to know that the number of such sextuples is roughly $O(\ell T^2)$ in the range ${N^{-1} \ll y \ll N^{-1/2}}$ and $|x| \le \frac {1}{2}$ . We do not know how to establish such a bound directly when, for instance, $\ell = N$ . On the other hand, when $\ell = 1$ , the congruence condition is void and, using arguments of Harcos–Templier, we can prove the required bound with some room to spare, namely, for T up to $N^{1/2}$ . Our solution to this dichotomy is thus to decrease the size of the Siegel domains associated to larger $\ell $ at the expense of increasing those associated to smaller $\ell $ . This solution may be implemented most simply by applying an Atkin–Lehner involution to the covering of $\Gamma _0(N) \backslash \mathbb {H}$ by $\operatorname {\mathrm {SL}}_2(\mathbb {Z})$ translates of the standard fundamental domain for $\operatorname {\mathrm {SL}}_2(\mathbb {Z})$ . With this maneuver, we reduce to considering the range $T \ll N^{\frac {1}{2}} \ell ^{-1}$ . We are then able to prove the required bound by forgoing the congruence condition, reducing the problem to counting triples of integers $(a_i,b_i,c_i)$ satisfying Equation (1.10), which we carry out using geometry of numbers techniques. We refer subsequently to this type of counting problem, where we count traceless matrices $\gamma \in R(\ell )^0$ with a bound on the entries of $\sigma _{z}^{-1}\gamma \sigma _z$ , as ‘Type I’.
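A Type I count can be explored numerically in the split case. The sketch below is a brute-force toy and not the geometry of numbers argument used in the paper. It assumes that the entry conditions are replaced by a single bound $T$ on the Frobenius norm of $\sigma_z^{-1} \gamma \sigma_z$, that $\sigma_z = \left(\begin{smallmatrix} \sqrt{y} & x/\sqrt{y} \\ 0 & 1/\sqrt{y} \end{smallmatrix}\right)$, and that traceless elements of $R(\ell)$ are parametrized as $\left(\begin{smallmatrix} a & b/\ell \\ Nc/\ell & -a \end{smallmatrix}\right)$ with integers $a, b, c$. The name `count_type_one` is hypothetical.

```python
import math

def count_type_one(N, ell, x, y, T):
    """Count traceless gamma = [[a, b/ell], [N*c/ell, -a]] (a, b, c integers)
    whose conjugate M = sigma_z^{-1} gamma sigma_z has Frobenius norm <= T.
    The zero matrix is included in the count."""
    assert N % ell == 0
    eps = 1e-9                      # tolerance for floating-point boundaries
    count = 0
    # M21 = (N*c/ell) * y, so |c| <= T * ell / (N * y).
    c_max = math.floor(T * ell / (N * y) + eps)
    for c in range(-c_max, c_max + 1):
        lower = N * c / ell         # lower-left entry of gamma
        m21 = lower * y
        # M11 = -M22 = a - x*lower, and 2*M11^2 <= T^2.
        half = T / math.sqrt(2)
        a_lo = math.ceil(x * lower - half - eps)
        a_hi = math.floor(x * lower + half + eps)
        for a in range(a_lo, a_hi + 1):
            m11 = a - x * lower
            # M12 = (b/ell + 2*a*x - lower*x^2) / y, and |M12| <= T.
            center = ell * (lower * x * x - 2 * a * x)
            b_lo = math.ceil(center - ell * T * y - eps)
            b_hi = math.floor(center + ell * T * y + eps)
            for b in range(b_lo, b_hi + 1):
                m12 = (b / ell + 2 * a * x - lower * x * x) / y
                if 2 * m11 ** 2 + m12 ** 2 + m21 ** 2 <= T * T + eps:
                    count += 1
    return count
```

At $z = i$ with $N = \ell = 1$ the condition reduces to $2a^2 + b^2 + c^2 \le T^2$, which gives a quick sanity check. Since $R(1)^0 \subseteq R(\ell)^0$, the count can only grow with $\ell$ for fixed $z$ and $T$.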
In the case of equal trace, we need to count sextuples of integers $(a_1,b_1,c_1,a_2,b_2,c_2)$ satisfying Equation (1.10) and
We need to bound this count by $O(\ell T)$ in the same ranges as before. We refer to this type of counting problem as ‘Type II’. The key observation is that $(a_1,b_1,c_1)$ turns out to determine $(a_2,b_2,c_2)$ up to a small number of possibilities. This allows us to reduce Type II estimates to Type I estimates.
The above arguments suffice for noncompact quotients, that is, for the proof of Theorem 1.1. They rely on the use of matrix coordinates $\left (\begin {smallmatrix} a & b \\ c & d \end {smallmatrix}\right )$ with respect to which the lattices $\Gamma _0(N)$ are described by the simple congruence condition $c \equiv 0 \pmod {N}$ . We were unable to find an analogously straightforward way to separate the variables in the compact setting (e.g., using fixed quadratic subalgebras of B). In the case that B is definite, the Type I counts were treated in a coordinate-free way by Blomer–Michel [BM11, BM13], who controlled the successive minima of the ternary quadratic lattice underlying $\Gamma _0^B(N)$ in terms of only the content, level and discriminant of that lattice. We extend their arguments to the case that B is indefinite by defining analogous Archimedean quantities that control the disparity of the reduced norm and a majorant, such as the square of the Frobenius norm of $\sigma _z^{-1} \gamma \sigma _z$ for $\gamma \in R(\ell )^0$ .
Following the same strategy as in the noncompact case, it remains then only to reduce Type II estimates to Type I estimates. This reduction is perhaps the most subtle part of our counting arguments. It requires us to establish the analogue in the compact setting of the key observation noted following Equation (1.12). For example, in case that B is definite, writing R for an Eichler order of level N, we need to show that for each $n \ll V$ , the number of elements $\gamma \in R$ with trace $0$ and norm n is essentially $O(1)$ , uniformly in N and B. We eventually managed to do so through a delicate argument involving commutators and representations of binary quadratic forms.
1.4. Organization of the paper
The complete statements of our results may be found in Section §2. In Section §3, we reduce the proofs to those of two auxiliary collections of results:

∘ those concerning matrix counting, and

∘ those reducing the required estimates for theta functions to matrix counting.
The latter, including the appropriate splicing of a fundamental domain into Siegel sets, may be found in Section §4. In Section §5, we summarize the required properties of the theta functions. The proofs of said properties are deferred to Appendix A.
Sections §7 and §8 are dedicated to the anisotropic extension of the lattice counting argument of Blomer–Michel, which we subsequently apply to the Type I counting problem in Section §9.
The final section, §10, treats the crucial Type II counting problem.
2. Statement of results
2.1. Setup
Let B be a quaternion algebra over $\mathbb {Q}$ . We denote by $d_B$ its reduced discriminant, or equivalently, the product of the primes at which B ramifies. We write G for the linear algebraic group over $\mathbb {Q}$ given by $G(L) = (B \otimes _{\mathbb {Q}} L)^\times / L^\times $ for any $\mathbb {Q}$ -algebra L. Then G is an inner form of $\operatorname {PGL}_2$ , and all rational forms of $\operatorname {PGL}_2$ arise in this way. Denote by $[G]$ the adelic quotient $G(\mathbb {Q}) \backslash G(\mathbb {A})$ . We fix the probability Haar measure on $[G]$ . Let $K_\infty $ be a compact maximal torus of $G(\mathbb {R})$ . We assume that $K_\infty $ comes equipped with a choice of isomorphism $\kappa : \mathbb {R} / \pi \mathbb {Z} \xrightarrow {\sim } K_\infty $ . In the split case $B=\operatorname {\mathrm {Mat}}_{2 \times 2}(\mathbb {Q})$ , we identify $G=\operatorname {PGL}_2$ and set $\kappa (\theta )=\left (\begin {smallmatrix} \cos (\theta ) & \sin (\theta ) \\ -\sin (\theta ) & \cos (\theta ) \end {smallmatrix}\right )$ .
Let R be an Eichler order in B, that is, an intersection of two maximal orders. We denote by N the level of R. It is a natural number, coprime to $d_B$ , characterized as follows: For each prime $p \nmid d_B$ , there is an isomorphism $B_p := B \otimes _{\mathbb {Q}} \mathbb {Q}_p \cong \operatorname {\mathrm {Mat}}_{2 \times 2}(\mathbb {Q}_p)$ under which $R_p := R \otimes _{\mathbb {Z}} \mathbb {Z}_p$ maps to the order $\left ( \begin {smallmatrix} \mathbb {Z}_p &\mathbb {Z}_p \\ N \mathbb {Z}_p& \mathbb {Z}_p \end {smallmatrix} \right )$ . We may then identify $G(\mathbb {Q}_p)$ with $\operatorname {PGL}_2(\mathbb {Q}_p)$ and the image of $R_p^\times $ with a finite index subgroup of $\operatorname {PGL}_2(\mathbb {Z}_p)$ . We assume that N is squarefree so that $d_B N$ is likewise squarefree. We denote by $K_R$ the compact open subgroup of $G(\mathbb {A}_f)=\prod _p ' G(\mathbb {Q}_p)$ given by the image of $\prod _p R_p^\times $ .
Fix $k \in 2\mathbb {Z}$ . Let $\mathcal {A}$ denote the set of cusp forms $\varphi : [G] \rightarrow \mathbb {C}$ having the following properties:

∘ $\varphi (g \kappa (\theta )) = e^{i k \theta } \varphi (g)$ for all $\theta $ .

∘ $\varphi $ is an eigenfunction for some fixed Casimir operator for $G(\mathbb {R})$ , with eigenvalue $\lambda _{\varphi }$ . For the sake of concreteness, we scale the Casimir operator such that it agrees with the standard Laplace operator on the locally symmetric space $G(\mathbb {R}) / K_{\infty }$ , which identifies with either $\mathbb {H}$ or $S^2$ .

∘ $\varphi $ is $K_R$ invariant: $\varphi (g k) = \varphi (g)$ for $k \in K_R$ .

∘ $\varphi $ belongs to the newspace for R, that is, $K_R$ is the largest subgroup of $G(\mathbb {A}_f)$ keeping $\varphi $ invariant. Equivalently, $\varphi $ is orthogonal to the space of $K_{R'}$ -invariant cusp forms for every Eichler order $R'$ strictly containing R.

∘ $\varphi $ is an eigenform for almost all Hecke operators.
If $k \ge 2$ , then we write $\mathcal {A}^{\operatorname {hol}} \subseteq \mathcal {A}$ for the subspace of automorphic lifts of holomorphic forms or, equivalently, the kernel of the raising (resp. lowering) operator attached to $K_\infty $ if B is definite (resp. indefinite).
Denote by $\mathcal {F}$ a maximal orthonormal subset of $\mathcal {A}$ . Analogously, we define $\mathcal {F}^{\operatorname {hol}} \subseteq \mathcal {A}^{\operatorname {hol}}$ if $k \ge 2$ . Because of the multiplicity-one theorem for $\operatorname {GL}_2$ and its inner forms, the bases $\mathcal {F}, \mathcal {F}^{\operatorname {hol}}$ are unique up to rescaling each element by a scalar of unit magnitude. We note that the sets $\mathcal {A}$ , $\mathcal {A}^{\operatorname {hol}}$ , $\mathcal {F}$ and $\mathcal {F}^{\operatorname {hol}}$ depend on k; while we suppress this dependence from the notation, k is one of the main parameters of interest.
We will consider several subfamilies of $\mathcal {F}$ and $\mathcal {F}^{\operatorname {hol}}$ . Here, a minus sign in the exponent signifies the indefinite case, a plus sign the definite case.

∘ If B is indefinite and $k=0$ , then we take $\mathcal {F}^{-} = \mathcal {F}$ and let $\mathcal {F}^{-}_{\lambda }$ (resp. $\mathcal {F}^{-}_{\le L}$ ) denote the subsets defined by taking the Casimir eigenvalue equal to $\lambda $ (resp. at most L in magnitude).

∘ If B is indefinite and $k \ge 2$ , then we take $\mathcal {F}^{-, \operatorname {hol}}=\mathcal {F}^{\operatorname {hol}}$ .

∘ If B is definite and $k=0$ , then we let $\mathcal {F}^{+}_m \subset \mathcal {F}$ be the subset of forms whose associated automorphic representation at infinity is isomorphic to the unique irreducible unitary representation of $\operatorname {\mathrm {SU}}_2(\mathbb {C})$ of degree $m+1$ . In other words, their eigenvalue with respect to the Casimir operator equals $m(m+1)$ .

∘ If B is definite and $k \ge 2$ , then we let $\mathcal {F}^{+, \operatorname {hol}}=\mathcal {F}^{\operatorname {hol}}$ .
2.2. The split case
Assume for the moment that B is split. We may suppose then that
and may identify
We define
as follows. Let $A_0(N)<\operatorname {GL}_2(\mathbb {Q})^+$ denote the group generated by $\Gamma _0(N)$ and all Atkin–Lehner operators. If $g \in [G] / K_\infty K_R$ identifies with $z \in \Gamma _0(N) \backslash \mathbb {H}$ , then we set
Since the Atkin–Lehner operators constitute scaling matrices for the various cusps of $\Gamma _0(N)$ (cf. §4.3.1), the function H may be understood as a normalized height or as quantifying closeness to the cusps. Let $\mathfrak {a} \in P^{1}(\mathbb {Z})$ be a cusp of $\Gamma _0(N)$ , and let $\sigma _{\mathfrak {a}}\in \operatorname {\mathrm {SL}}_2(\mathbb {Z})$ be such that $\sigma _{\mathfrak {a}} \infty = \mathfrak {a}$ . Then,
where $\mathfrak {a}$ runs over all cusps of $\Gamma _0(N)$ , $z_{\mathfrak {a}}= \sigma _{\mathfrak {a}}^{-1} z$ and $w_{\mathfrak {a}}$ is the cusp width of $\mathfrak {a}$ .
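The Atkin–Lehner operators entering $A_0(N)$ can be written down explicitly. The following Python sketch assumes the classical description of an Atkin–Lehner matrix for $\ell \mid N$ (with $N$ squarefree) as an integer matrix $\left(\begin{smallmatrix} \ell a & b \\ Nc & \ell d \end{smallmatrix}\right)$ of determinant $\ell$. It builds one representative with $c = d = 1$ and checks that conjugation by it preserves $\Gamma_0(N)$. The helper names `atkin_lehner`, `mul` and `normalizes` are illustrative only.

```python
def atkin_lehner(N, ell):
    """One integer matrix W = [[ell*a, b], [N, ell]] with det W = ell,
    for squarefree N and a divisor ell of N (assumed representative)."""
    assert N % ell == 0
    m = N // ell
    if m == 1:
        a, b = 1, ell - 1
    else:
        a = pow(ell, -1, m)       # needs gcd(ell, N/ell) = 1; Python 3.8+
        b = (ell * a - 1) // m    # exact, because ell*a = 1 mod m
    return [[ell * a, b], [N, ell]]

def mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def normalizes(W, gamma, N, ell):
    """Check that W * gamma * W^{-1} lies in Gamma_0(N)."""
    adj = [[W[1][1], -W[0][1]], [-W[1][0], W[0][0]]]   # adj(W) = ell * W^{-1}
    P = mul(mul(W, gamma), adj)                        # ell * (W gamma W^{-1})
    if any(entry % ell for row in P for entry in row):
        return False
    Q = [[entry // ell for entry in row] for row in P]
    det = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]
    return det == 1 and Q[1][0] % N == 0
```

For $N = 6$ and $\ell = 2$ this produces $W = \left(\begin{smallmatrix} 4 & 1 \\ 6 & 2 \end{smallmatrix}\right)$, and conjugating $\left(\begin{smallmatrix} 1 & 1 \\ 0 & 1 \end{smallmatrix}\right)$ gives $\left(\begin{smallmatrix} -11 & 8 \\ -18 & 13 \end{smallmatrix}\right) \in \Gamma_0(6)$.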
2.3. Results on forms
We adopt the following asymptotic notation $\preccurlyeq $ :
where $\mu $ is a quantity relating to the eigenvalues with respect to the Casimir operator of the automorphic forms of relevance to the inequality. Concretely, when talking about the families $\mathcal {F}^{-}_{\lambda },\mathcal {F}^{-}_{\le L}, \mathcal {F}^{+}_m, \mathcal {F}^{\pm ,\operatorname {hol}}$ we mean $\mu =\lambda ,L,m, k$ , respectively.
Theorem 2.1. Let $g_1,g_2 \in [G]$ . If B is indefinite, then
for $L> 0$ , and
for $k \ge 2$ even. In both cases, the term involving $H(g_{1,2})$ is only present if B is split.
If B is definite, then
for $m \in \mathbb {N}_0$ , and
for $k \in 2 \mathbb {N}$ .
Remark 2.2. In the indefinite holomorphic case (2.5), one may have the same bound for the fourth moment rather than the squared difference under the assumption that the weight satisfies $k \gg _{\eta } (d_BN)^{\eta }$ for some $\eta>0$ , in which case the implied constant also depends on $\eta $ and the implied constant in the assumed lower bound for the weight.
Corollary 2.3. For $k \ge 2$ and $\varphi \in \mathcal {F}^{\operatorname {hol}}$ , we have
For $k=0$ and $\varphi \in \mathcal {F}$ , we have
If B is definite, then we have more precisely
By a well-known procedure, these statements may be translated into the classical language, thus giving rise to the theorems in the introduction. For further details, see, for example, [Bum97, §3.2 & §3.6] for the indefinite case and [BM13] for the definite case.
2.4. Counting problems: setup
2.4.1. Lattices locally dual to R
Let $\ell $ be a divisor of the squarefree number $d_B N$ . We denote by $R(\ell )$ the lattice in B whose local components $R(\ell )_p$ are given

∘ for p dividing $\ell $ , by the lattice $R_p^\vee \subseteq B_p$ dual to $R_p$ , and

∘ otherwise, by $R_p$ .
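In the split case the local description above can be tested globally. The sketch below assumes $R = \left(\begin{smallmatrix} \mathbb{Z} & \mathbb{Z} \\ N\mathbb{Z} & \mathbb{Z} \end{smallmatrix}\right)$ and $R(\ell) = \left(\begin{smallmatrix} \mathbb{Z} & \mathbb{Z}/\ell \\ N\mathbb{Z}/\ell & \mathbb{Z} \end{smallmatrix}\right)$ as in the introduction, and it assumes that duality is taken with respect to the pairing $(X, Y) \mapsto \operatorname{tr}(XY)$. It checks that every basis element of $R(\ell)$ pairs integrally with $R$. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def basis_R(N):
    """Z-basis of R = [[Z, Z], [N Z, Z]]."""
    return [[[1, 0], [0, 0]], [[0, 1], [0, 0]],
            [[0, 0], [N, 0]], [[0, 0], [0, 1]]]

def basis_R_ell(N, ell):
    """Z-basis of R(ell) = [[Z, Z/ell], [N Z/ell, Z]] for ell dividing N."""
    assert N % ell == 0
    return [[[1, 0], [0, 0]], [[0, Fraction(1, ell)], [0, 0]],
            [[0, 0], [Fraction(N, ell), 0]], [[0, 0], [0, 1]]]

def trace_pairing(X, Y):
    """tr(XY) for 2x2 matrices X and Y."""
    return (X[0][0] * Y[0][0] + X[0][1] * Y[1][0]
            + X[1][0] * Y[0][1] + X[1][1] * Y[1][1])

def pairs_integrally(N, ell):
    """True if tr(XY) is an integer for all basis pairs X in R(ell), Y in R."""
    return all(Fraction(trace_pairing(X, Y)).denominator == 1
               for X, Y in product(basis_R_ell(N, ell), basis_R(N)))
```

By contrast, the matrix with single nonzero entry $\frac{1}{2N}$ in the upper right corner pairs with $\left(\begin{smallmatrix} 0 & 0 \\ N & 0 \end{smallmatrix}\right)$ to give $\frac{1}{2}$, so it lies outside the dual of $R$.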
2.4.2. Reduced trace and norm
We denote by $\operatorname {\mathrm {tr}}$ and $\det $ the reduced trace and reduced norm on B, and also on its completions. We use a superscripted $0$ , as in $R^0$ or $R(\ell )^0$ , to denote the kernel of the reduced trace.
2.4.3. Coordinates tailored to $K_\infty $
Define $B_\infty := B \otimes _{\mathbb {Q}} \mathbb {R}$ . If B is indefinite, then $B_\infty \cong \operatorname {\mathrm {Mat}}_{2 \times 2}(\mathbb {R})$ is split; otherwise, $B_\infty $ is isomorphic to the real Hamilton quaternions. The exponential series identifies $B_\infty ^0$ with the Lie algebra of $G(\mathbb {R})$ . We write ${\mathbf{i}} \in B_\infty ^0$ for the derivative at the identity of $\kappa $ so that $\kappa (\theta ) = \exp ( \theta {\mathbf{i}} )$ . Then, ${\mathbf{i}} ^2 = -1$ . We may find ${\mathbf{j}} \in B_\infty ^0$ with ${\mathbf{j}} ^2 = \pm 1$ ( $+1$ if B is indefinite, $-1$ if B is definite) so that $B_\infty = \mathbb {R}({\mathbf{i}} ) \oplus \mathbb {R}({\mathbf{i}} ) {\mathbf{j}} $ . We note that ${\mathbf{j}} $ is not uniquely determined, but any two choices differ by multiplication by a norm one element of $\mathbb {R}({\mathbf{i}} )$ . We set ${\mathbf{k}} = {\mathbf{i}} {\mathbf{j}} $ . Then, ${\mathbf{i}} ,{\mathbf{j}} ,{\mathbf{k}} $ give an $\mathbb {R}$ -basis of $B_\infty ^0$ . For real numbers $a,b,c$ , we set $[a,b,c] := a {\mathbf{i}} + b {\mathbf{j}} + c {\mathbf{k}} $ . A general element of $B_\infty $ may then be written $[a,b,c] + d$ , where we identify the real number d with a scalar element of $B_\infty $ . In these coordinates, $\operatorname {\mathrm {tr}}([a,b,c]+d) = 2d$ and $\det ([a,b,c]+d) = a^2 + d^2 \mp (b^2 + c^2)$ , with the upper sign if B is indefinite and the lower sign if B is definite.
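The relations among ${\mathbf{i}}, {\mathbf{j}}, {\mathbf{k}}$ can be verified by hand in the split case with a few lines of Python. The sketch assumes $\kappa(\theta) = \left(\begin{smallmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{smallmatrix}\right)$, so that ${\mathbf{i}} = \left(\begin{smallmatrix} 0 & 1 \\ -1 & 0 \end{smallmatrix}\right)$, and it picks ${\mathbf{j}} = \left(\begin{smallmatrix} 0 & 1 \\ 1 & 0 \end{smallmatrix}\right)$. This is only one admissible choice and need not be the one made elsewhere in the paper.

```python
def mul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[r][t] * B[t][s] for t in range(2)) for s in range(2)]
            for r in range(2)]

ONE = [[1, 0], [0, 1]]
QI = [[0, 1], [-1, 0]]     # derivative of kappa at theta = 0 (assumed sign)
QJ = [[0, 1], [1, 0]]      # an assumed choice with j^2 = +1 and trace zero
QK = mul(QI, QJ)           # k = i j

def element(a, b, c, d):
    """The matrix [a, b, c] + d = a*i + b*j + c*k + d."""
    return [[a * QI[r][s] + b * QJ[r][s] + c * QK[r][s] + d * ONE[r][s]
             for s in range(2)] for r in range(2)]

def det(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def tr(M):
    return M[0][0] + M[1][1]
```

With these choices `element(a, b, c, d)` is the matrix $\left(\begin{smallmatrix} d+c & a+b \\ b-a & d-c \end{smallmatrix}\right)$, whose determinant is $a^2 + d^2 - b^2 - c^2$ and whose trace is $2d$.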
Example 2.4. Suppose that $B_\infty = \operatorname {\mathrm {Mat}}_{2 \times 2}(\mathbb {R})$ and that $\kappa $ is as in Equation (2.2). Then, with suitable choices,
2.4.4. Archimedean regions
For $T> 0$ and $\delta \in (0,1]$ , we denote by $\Omega (\delta ,T)$ the set of all elements $[a,b,c] + d$ of $B_\infty $ for which
With $\Omega ^{\star }(\delta ,T)$ , we denote the subset of nonzero elements of $\Omega (\delta ,T)$ . Likewise, for $T>0$ and $\delta \in (0,1]$ , we let $\Psi (\delta ,T)$ denote the set of all elements $[a,b,c]+d$ of $B_{\infty }$ for which
and $\Psi ^{\star }(\delta ,T)$ its subset consisting of nonzero elements.
2.5. Counting problems: results
We adopt the following asymptotic notation for counting estimates (compare with the notation $\preccurlyeq $ introduced in §2.3):
Recall from §2.2 the height function H defined in the split case. In the nonsplit case, we adopt the convention in the following results that any terms involving H (in minima or sums) should be omitted.
Theorem 2.5 (Type I estimates).
Let $g \in G(\mathbb {R})$ . Then, the first successive minimum (see Definition 6.1) of $g^{-1} R(\ell )^0 g$ with respect to $\Omega (\delta , 1)\cap B_{\infty }^0$ is $\gg \min \left \{ \ell ^{-\frac {1}{2}} , \ell ^{-1} \delta ^{-\frac {1}{2}} H(g)^{-1} \right \}$ . Furthermore, we have
If B is nonsplit, we further have that the first successive minimum of $g^{-1} R(\ell )^0 g$ with respect to $\Psi (\delta , 1)\cap B_{\infty }^0$ is $\gg \ell ^{-\frac {1}{2}}$ and
Theorem 2.6 (Type II estimates).
Let $g \in G(\mathbb {R})$ and $n \in \frac {1}{\ell } \mathbb {Z}$ . We have
The proof of these results occupies §7 onwards. In §3, we explain how these results imply our main fourth moment bound, Theorem 2.1.
3. Division and reduction of the proof
3.1. Traversing the genus
Recall that $K_R$ is defined as the image of the subgroup $\prod _p R_p^{\times }$ in $G(\mathbb {A}_f)$ ; it is a compact open subgroup of $G(\mathbb {A}_f)$ . In due course, we will consider the conjugated sets $h_f K_R h_f^{-1}$ , for $h_f \in G(\mathbb {A}_f)$ . These are precisely the compact open subgroups $K_{R'}$ associated to the Eichler orders $R'$ in the genus of R. We note that $R'$ has the same level as R and may be given explicitly by the following intersection:
where $\widehat {\mathbb {Z}}$ denotes the closure of $\mathbb {Z}$ inside $\mathbb {A}_f$ . We further note that the action of $G(\mathbb {A}_f)$ on the genus of R commutes with partial dualization in the sense that
This observation permits us to formulate the required $L^2$ estimates for our differences of theta kernels in terms of integration over Archimedean, rather than adelic, arguments. To that end, we introduce the notation
for $h=(h_\infty ,h_f) \in G(\mathbb {A})$ . We note that for $h \in G(\mathbb {R})$ (i.e., $h_f = 1$ ), the set $R(\ell; h)$ is just $h^{-1}R(\ell )h$ . Since taking the trace commutes with conjugation, we may extend the notation to kernels of the reduced trace without concern for confusion regarding the order of operation, that is,
If B is split, then the class number of R is one and we have fixed the representative as in Equation (2.1). In this case, we find for $h \in G(\mathbb {A})$ that $h^{-1}R h=h^{\prime -1}Rh'$ , where $h' \in G(\mathbb {R})$ has the same image under the isomorphism $[G]/ K_{\infty }K_R \cong \Gamma _0(N) \backslash \mathbb {H}$ as h does. In particular, we have the equality of height functions (see §2.2) $H(h)=H(h')$ .
Remark 3.1. By considering a right translate of $\varphi \in \mathcal {F}$ and thereby moving the maximal compact $K_{\infty }$ and the Eichler order R around, one could reduce the statement of the main Corollary 2.3 to the case that g is the identity. However, in the split case, our counting arguments do depend on the particular order in the genus. Moreover, our method relies on a difference of theta kernels defined relative to different g. Such a reduction would thus be premature.
3.2. Estimating fourth moments via lattice sums
In §5, we introduce certain theta kernels. A spectral expansion of their $L^2$ norms will yield the fourth moments of interest, while a ‘geometric’ expansion, using Siegel domains and Fourier expansions, bounds those $L^2$ norms in terms of certain lattice sums. We now state the latter bounds.
Proposition 3.2. Suppose B is indefinite. Then, for $g_1, g_2 \in [G]$ , there exist $\ell \mid d_BN$ , $g \in \{g_1, g_2\}$ , and $0< T \preccurlyeq \frac {(d_B N(k+1))^{\frac {1}{2}}}{\ell }$ (here, the notation $\preccurlyeq $ is as in §2.3) so that, for $k=0$ ,
while for $k> 0$ ,
Proposition 3.3. Suppose B is indefinite. Let $g \in [G]$ , and assume that $k \gg (d_BN)^{\eta }$ for some arbitrarily small $\eta>0$ . Then, there exist $\ell \mid d_BN$ and $ 0<T \preccurlyeq \frac {(d_B N k)^{\frac {1}{2}}}{\ell }$ so that
Proposition 3.4. Suppose B is definite and the weight is $k=0$ . Then, for $g_1,g_2 \in [G]$ and $m \in \mathbb {N}_0$ , there exist $\ell \mid d_BN$ , $ 0< T \preccurlyeq \frac {(d_B N (m+1))^{\frac {1}{2}}}{\ell }$ and $\frac {1}{m^2+1} \preccurlyeq \delta \le 1$ so that
Proposition 3.5. Suppose B is definite. Then, for $g \in [G]$ , there exist $\ell \mid d_BN$ and $ 0< T \preccurlyeq \frac {(d_B N k)^{\frac {1}{2}}}{\ell }$ so that
3.3. Reduction to ternary lattices
In this section, we reduce the vital counting problem involving a quaternary quadratic form to problems involving only ternary quadratic forms. The key observation is that we may orthogonally decompose the quaternion algebra $B_{\infty }$ into its trace part and its traceless part $B_{\infty }^{0}$ . Thus, for any $\alpha = \frac {1}{2}\operatorname {\mathrm {tr}}(\alpha ) + \alpha ^0 \in \mathbb {R} \oplus B_{\infty }^{0}$ , we have
We further note that the trace is invariant under conjugation. Hence, we have $\operatorname {\mathrm {tr}}(R(\ell ;g)) \subseteq \mathbb {Z}$ . We conclude that
is a sublattice of the direct sum of the lattices $\frac {1}{2}\mathbb {Z}$ in $\mathbb {R}$ and $\tfrac {1}{2} R(\ell ;g)^{0}$ in $B_{\infty }^0$ . Using this decomposition, we deduce
by distinguishing the two cases of equal and nonequal trace and applying the divisor bound to the equality
We remark that we have forfeited the congruence condition $\det (\gamma _1^0) \equiv \det (\gamma _2^0) \ \mathrm {mod}\,(1)$ , and this forfeiture will be reflected in the suboptimality of our final counting estimates on larger scales when $\ell>1$ . We circumnavigate these larger scales by an appropriate choice of a covering domain (cf. Lemma 4.1).
Note that we may further bound the diagonal contribution by considering its largest fiber:
Arguing along the same lines, we also arrive at
and
Note that in this last inequality, we have passed from $\Psi (\delta ,2T)$ to the larger set $\Omega (1,2T)=\Psi (1,2T)$ ; the resulting bound remains adequate for us thanks to the additional saving of $\delta ^{\frac {1}{2}}$ in Equation (3.10).
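The orthogonal decomposition used in this section can be checked by hand in the split model. The following is a minimal sketch, assuming $B_\infty = \operatorname{Mat}_{2\times 2}(\mathbb{R})$ so that the reduced trace and reduced norm are the matrix trace and determinant; it verifies the identity $\det(\alpha) = \frac{1}{4}\operatorname{tr}(\alpha)^2 + \det(\alpha^0)$ on a few sample matrices. The function names and the sample data are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def det2(m):
    """Determinant (reduced norm) of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return a * d - b * c

def trace2(m):
    """Trace (reduced trace) of a 2x2 matrix."""
    return m[0][0] + m[1][1]

def traceless_part(m):
    """Return alpha^0 = alpha - tr(alpha)/2, the projection onto trace-zero matrices."""
    half = Fraction(trace2(m), 2)
    (a, b), (c, d) = m
    return ((a - half, b), (c, d - half))

def decomposition_holds(m):
    """Check det(alpha) == (tr(alpha)/2)^2 + det(alpha^0) in exact arithmetic."""
    half = Fraction(trace2(m), 2)
    return det2(m) == half * half + det2(traceless_part(m))

# Arbitrary sample matrices (illustrative data only).
samples = [
    ((1, 2), (3, 4)),
    ((0, -5), (7, 2)),
    ((Fraction(1, 2), 3), (4, Fraction(-5, 2))),
]
print(all(decomposition_holds(m) for m in samples))
```

Exact rational arithmetic is used so that the check is an identity test rather than a floating-point comparison.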
3.4. Proof of Theorem 2.1
Theorem 2.1 is an immediate consequence of the following pair of lemmas together with Propositions 3.2 through 3.5.
Lemma 3.6. We have
Proof. Recall from the discussion of §3.1 that we may express $R(\ell ;g)^0$ , for $g \in G(\mathbb {A})$ , as $(g')^{-1}R'(\ell )^0g'$ , where $R'$ is an Eichler order of the same level and $g' \in G(\mathbb {R})$ , with $H(g)=H(g')$ in the case that B is split. We may thus apply the results of §2.5.
Since $\operatorname {\mathrm {tr}}(R(\ell ;g)) \subseteq \mathbb {Z}$ , we find that the first successive minimum of $R(\ell ;g)$ with respect to $\Omega (\delta ,1)$ is at least the minimum of $1$ and the first successive minimum of $R(\ell ;g)^0$ with respect to $\Omega (\delta ,1)\cap B_{\infty }^{0}$ . The latter is $\gg \min \{ \ell ^{-\frac {1}{2}}, \ell ^{-1} \delta ^{-\frac {1}{2}} H(g)^{-1} \} =: \Lambda $ by Theorem 2.5, where the term involving $H(g)$ is to be omitted if B is nonsplit. Thus, we find that $R(\ell ;g) \cap \Omega ^{\star }(\delta ,T)$ is empty for $T \ll \Lambda $ , in which case there is nothing to show. Next, assume instead that $T \gg \Lambda $ . Then, by Theorem 2.5, we have
where we have used $1 \ll \ell ^{\frac {1}{2}}T+\ell \delta ^{\frac {1}{2}}H(g)T$ and $\ell ^{\frac {3}{2}} \le \ell ^{2}$ . Further note that the middle term in the bracket is dominated by the sum of the first and last term in the bracket. We also find by Theorem 2.6 that
We conclude the Lemma by further appealing to the inequalities (3.8) and (3.9).
Lemma 3.7. Assume that B is nonsplit. Then
Proof. As in the proof of Lemma 3.6, we find that the first successive minimum of $R(\ell ;g)$ with respect to $\Psi (\delta ,1)$ is at least the minimum of $\delta ^{-\frac {1}{2}}$ and the first successive minimum of $R(\ell ;g)^0$ with respect to $\Psi (\delta ,1)\cap B_{\infty }^{0}$ . The latter is $\gg \ell ^{-\frac {1}{2}}$ by Theorem 2.5. Therefore, $R(\ell ;g) \cap \Psi ^{\star }(\delta ,T)$ is empty for $T \ll \ell ^{-\frac {1}{2}} \le 1 \le \delta ^{-\frac {1}{2}}$ , in which case there is nothing to show. If $T \gg \ell ^{-\frac {1}{2}}$ , then, by Theorem 2.5, we have
where we have used $1 \ll \ell ^{\frac {1}{2}}T$ and $\ell ^{\frac {3}{2}} \le \ell ^{2}$ . Furthermore, by Theorem 2.6, we have
where we have used $1 \le \ell ^{\frac {1}{2}}$ and $1 \le \delta ^{-\frac {1}{2}}$ . We conclude the lemma by further appealing to the inequalities (3.10) and (3.11).
3.5. Proof of Corollary 2.3
Let $\varphi \in \mathcal {F}$ , respectively $\mathcal {F}^{\operatorname {hol}}$ , be $L^2$ normalized. Assume first that B is nonsplit. Then, since $[G] / K_{\infty } K_{R}$ is compact and equipped with a probability measure, we may find $g_2$ in $[G]$ such that $|\varphi (g_2)| \le 1$ . Hence, Corollary 2.3 follows immediately from Theorem 2.1 by positivity and the particular choice of $g_2$ .
We now turn our attention to the case that B is split, in other words when $d_B=1$ . Here, we need to supplement Theorem 2.1 with the additional information that for $H(g) \ge N^{\frac {1}{2}}$ , we have
The former is recorded in [Reference TemplierTem15, Prop. 3.1 & 3.2], for example, and the latter may be deduced from the Fourier expansion along the lines of Xia [Reference XiaXia07]. We include a brief proof here for the sake of completeness.
Lemma 3.8. Assume B is split, and let $\varphi \in \mathcal {F}^{-,\mathrm {hol}}$ be an $L^2$ normalized holomorphic cuspidal newform of squarefree level N and even weight $k \ge 2$ . Then, we have for all $g \in [G]$ ,
If $H(g) \ge \frac {k}{2 \pi }$ , then we have the stronger bound
Proof. Suppose that g corresponds to $z=x+iy \in \Gamma _0(N) \backslash \mathbb {H}$ . As $|\varphi (g)|$ is moreover invariant under the Atkin–Lehner operators, we may assume that z has maximal imaginary part under the action of the group $A_0(N)$ generated by the Atkin–Lehner operators and $\Gamma _0(N)$ , thus $H(g)=y$ . We shall subsequently make use of the Fourier expansion of $\varphi $ at $\infty $ :
We may bound the Fourier coefficients by appealing to Deligne’s bound for the Hecke eigenvalues [Reference DeligneDel71, Reference DeligneDel74]. This implies $a_n \ll _{\varepsilon } n^{\frac {k-1}{2}+\varepsilon } a_1$ .Footnote ^{1} We find
We may bound the above sum by comparison with the corresponding integral. To this end, we note that the function $x^{\alpha }e^{-x}$ increases up to $x= \alpha $ and then decreases. We may also bound the first Fourier coefficient $a_1$ by a result of Hoffstein–Lockhart [Reference Hoffstein and LockhartHL94] (cf. [Reference Harcos and MichelHM06, Eq. (31)]Footnote ^{2}). The bound reads $a_1 \ll _{\varepsilon } (Nk)^{\varepsilon } (4\pi )^{\frac {k}{2}} \Gamma (k)^{-\frac {1}{2}}$ . We thus arrive at
where we have made use of Stirling’s approximation. If $y \ge \frac {k}{2 \pi }$ , then the maximum summand occurs when $n=1$ and we may derive the improved bound
To deduce Equation (3.19) from the lemma, we consider separately the cases $N^{\frac {1}{2}} \le H(g) \le \frac {k}{ 2 \pi }$ and $H(g) \ge \frac {k}{2 \pi }$ , applying Equation (3.20) in the former case and Equation (3.21) in the latter.
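The monotonicity property used in the proof of Lemma 3.8 follows from $\frac{d}{dx}\bigl(x^{\alpha}e^{-x}\bigr) = x^{\alpha-1}e^{-x}(\alpha - x)$, so the function $x^{\alpha}e^{-x}$ (with the decaying exponential) peaks at $x=\alpha$. The sketch below is a numerical illustration only: it locates the maximizer on a grid for a sample exponent. The function names, the grid step and the sample value of $\alpha$ are assumptions made for the example.

```python
import math

def summand(x, alpha):
    """Evaluate x^alpha * exp(-x) for x > 0."""
    return x ** alpha * math.exp(-x)

def argmax_on_grid(alpha, step=0.01, upper=None):
    """Return the grid point in (0, upper] at which x^alpha * exp(-x) is largest."""
    if upper is None:
        upper = 4 * alpha + 10  # generous window that contains the peak x = alpha
    n = int(upper / step)
    grid = [step * i for i in range(1, n + 1)]
    return max(grid, key=lambda x: summand(x, alpha))

alpha = 5.5  # sample exponent (illustrative)
peak = argmax_on_grid(alpha)
print(abs(peak - alpha) <= 0.011)
```

Since the function is unimodal, the grid maximizer lies within one grid step of the true maximizer $x=\alpha$.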
We may now deduce the split case of Corollary 2.3, as follows. Our task is to bound $|\varphi (g_1)|$ suitably for $g_1 \in [G]$ . We may assume that $H(g_1) \le N^{\frac {1}{2}}$ , as otherwise the estimates (3.18) and (3.19) are adequate. In that case, we choose another point $g_2 \in [G]$ arbitrarily with $H(g_2) = N^{\frac {1}{2}}$ so that $|\varphi (g_2)| \preccurlyeq _{\lambda _{\varphi }} N^{\frac {1}{4}}$ , respectively $|\varphi (g_2)| \preccurlyeq (kN)^{\frac {1}{4}}$ , by Equation (3.18), respectively Equation (3.19). We apply Theorem 2.1 with these choices of $g_1$ and $g_2$ . Upon recalling that $d_B=1$ in the split case, we find by positivity that Equation (2.4), respectively Equation (2.5), yields
We conclude by the triangle inequality and taking square roots:
4. Arithmetic quotients as real manifolds
4.1. Measure normalizations
For indefinite B, we fix an isomorphism $G(\mathbb {R})\cong \operatorname {PGL}_2(\mathbb {R})$ sending $K_\infty $ to $\operatorname {PSO}_2(\mathbb {R})$ . We fix the Haar measure $\,\mathrm {d} g = \frac {\,\mathrm {d} y \,\mathrm {d} x}{y^2}\frac {\,\mathrm {d} \theta }{2 \pi }$ for $g=\left (\begin {smallmatrix} y^{1/2} & xy^{-1/2} \\ 0 & y^{-1/2} \end {smallmatrix}\right ) \kappa (\theta )$ on $\operatorname {SL}_2(\mathbb {R})$ . The pushforward of this measure to the hyperbolic plane is then the measure $\frac {\,\mathrm {d} x \,\mathrm {d} y}{y^2}$ . The Haar measure on $\operatorname {PGL}_2(\mathbb {R})$ is fixed so that its restriction to $\operatorname {PSL}_2(\mathbb {R})$ coincides with the pushforward of the Haar measure from $\operatorname {SL}_2(\mathbb {R})$ .
If B is definite, we fix an isomorphism $G(\mathbb {R})\cong \operatorname {SO}_3(\mathbb {R})$ sending $K_\infty $ to $\operatorname {SO}_2(\mathbb {R})$ . We fix a Haar measure on $\operatorname {SO}_3(\mathbb {R})$ so that the measure of the $2$ -sphere is $4\pi $ .
4.2. Volumes
Recall that we fixed the measure on $[G]$ to be the probability Haar measure. Hence, the volume of the quotient $[G]/K_R$ is $1$ . In due course, we shall also require the volume of said quotient when viewed as a real manifold with respect to our fixed Haar measure on $G(\mathbb {R})$ . More specifically, we will need the volume with respect to the measure on $G'(\mathbb {R})$ , where $G'$ is the linear algebraic group defined over $\mathbb {Q}$ whose rational points are the proper unit quaternions $B^1$ . There is an obvious isogeny map $G'\to G$ , where $G'$ is the simply connected form and G is the adjoint one. Define $R_p^1=R_p\cap G'(\mathbb {Q}_p)$ to be the proper unit quaternions in the local order $R_p$ , and set $K_R^1=\prod _{p} R_p^1$ . Then, the map $[G']/K_R^{1}\to [G]/K_R$ is a homeomorphism that pushes forward the probability Haar measure on $[G']/K_R^{1}$ to the probability Haar measure on $[G]/K_R$ ; see Lemma A.2. In general, this map is not bijective if $K_R$ is replaced by a general compact open subgroup of $G(\mathbb {A}_f)$ , and the fact that the map is indeed a homeomorphism is due to $K_R$ being the projectivized group of units of an Eichler order.
By Borel’s finiteness of class numbers [Reference BorelBor63], $[G']/K_R^1$ is a finite collection of $G'(\mathbb {R})$ -orbits with representatives $\delta _1,\ldots ,\delta _h\in G'(\mathbb {A})$ . Define $\Gamma _i=G'(\mathbb {Q})\cap \delta _i K_R^1 \delta _i^{-1}$ ; the intersection is taken in $G'(\mathbb {A}_f)$ but regarded as a subset of $G'(\mathbb {Q})$ and hence also of $G'(\mathbb {R})$ . In particular, $\Gamma _i$ is a lattice in $G'(\mathbb {R})$ . It follows that
This is a finite disjoint union of finite-volume homogeneous spaces for the real Lie group $G'(\mathbb {R})$ . We define $\operatorname {covol}(\Gamma _i)$ to be the measure of a fundamental domain for the action of $\Gamma _i$ on $G'(\mathbb {R})$ with respect to the fixed Haar measure on $G'(\mathbb {R})$ , either $\frac {\,\mathrm {d} x \,\mathrm {d} y}{y^2} \frac {\,\mathrm {d} \theta }{2\pi }$ in the indefinite case or the measure giving volume $4\pi $ to the $2$ -sphere in the definite case. We finally set
If B is indefinite, then $G'(\mathbb {R})\cong \operatorname {SL}_2(\mathbb {R})$ is noncompact and strong approximation implies that $h=1$ and we can write $[G]/K_R\cong \Gamma \backslash \operatorname {SL}_2(\mathbb {R})$ . In this case, V is the volume of the hyperbolic surface with respect to the volume form $\frac {\,\mathrm {d} x \,\mathrm {d} y}{y^2}$ . If B is definite, then in general h can be large and $[G]/K_\infty K_R$ is a finite collection of quotients of $2$ -spheres by discrete rotation groups.
Recall that we have denoted by $d_B$ the reduced discriminant of B and by N the squarefree level of the Eichler order R. The volume is given in both cases by
This follows from a corresponding mass formula; see [Reference VoightVoi18, Thm 39.1.8] in the indefinite case and [Reference VoightVoi18, Thm 25.1.1 & Thm 25.3.18] in the definite one. The space is furthermore compact if and only if B is nonsplit, that is, $d_B> 1$ .
4.3. Siegel domains
The main purpose of this section is to provide a specific Siegel-domain covering in order to bound the $L^2$ norm of the (difference of) theta kernels in §5.2. Let M be a squarefree natural number and set $U = \left (\begin {smallmatrix} \widehat {\mathbb {Z}} & \widehat {\mathbb {Z}} \\ M \widehat {\mathbb {Z}} & \widehat {\mathbb {Z}} \end {smallmatrix}\right ) \cap \operatorname {\mathrm {SL}}_{2}(\mathbb {A}_f)$ . The theta functions of interest will turn out to be right invariant in the symplectic variable under U with $M = d_B N$ , but the present discussion applies to any squarefree M.
4.3.1. Cusps and Atkin–Lehner operators
Since M is squarefree, a representative set of cusps for $\Gamma _0(M)$ is given by the ratios $\frac {\ell }{M}$ , where $\ell $ runs through the positive divisors of M. The width of the cusp $\frac {\ell }{M}$ (understood here with respect to the group $\Gamma _0(M)$ ) is given by $\ell $ [Reference IwaniecIwa97, §2.4]. For each $\ell \mid M$ , we choose an element $\tau _{\ell } \in \operatorname {\mathrm {SL}}_2(\mathbb {Z})$ satisfying
Then, the cusp $\tau _{\ell } \infty $ is $\Gamma _0(M)$ -equivalent to the cusp $\frac {\ell }{M}$ . Hence, writing $n(x)=\left (\begin {smallmatrix} 1 & x\\ & 1 \end {smallmatrix}\right )$ , we see that the elements $\tau _{\ell }n(j)$ , where $j=0,\dots ,\ell -1$ and $\ell \mid M$ , give a complete system of representatives for $\Gamma _0(M) \backslash \operatorname {\mathrm {SL}}_{2}(\mathbb {Z})$ . The normalized matrices $\tilde {\tau }_{\ell } := \tau _{\ell } a(\ell )$ , where $a(y)= \operatorname {\mathrm {diag}}(y^{\frac {1}{2}},y^{-\frac {1}{2}})$ , are scaling matrices for the respective cusps. Furthermore, the matrices $\tilde {\tau }_{\ell }$ are Atkin–Lehner operators for $\Gamma _0(M)$ and give a set of representatives for $A_0(M) / \Gamma _0(M)$ [Reference Atkin and LehnerAL70, Lemma 9].
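As a consistency check on the coset count above: the representatives $\tau_{\ell}n(j)$ are indexed by pairs $(\ell, j)$ with $\ell \mid M$ and $0 \le j < \ell$, so there are $\sum_{\ell \mid M} \ell$ of them, and for squarefree M this equals the standard index $[\operatorname{SL}_2(\mathbb{Z}) : \Gamma_0(M)] = \prod_{p \mid M}(p+1)$. The sketch below only enumerates the index pairs and compares the two counts; it does not construct the matrices $\tau_\ell$, and the function names are hypothetical.

```python
def divisors(m):
    """Positive divisors of m, in increasing order."""
    return [d for d in range(1, m + 1) if m % d == 0]

def prime_factors(m):
    """Distinct prime factors of m by trial division."""
    primes, p = [], 2
    while p * p <= m:
        if m % p == 0:
            primes.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        primes.append(m)
    return primes

def coset_labels(m):
    """Index pairs (l, j) with l | m and 0 <= j < l labelling the elements tau_l n(j)."""
    return [(l, j) for l in divisors(m) for j in range(l)]

def index_gamma0_squarefree(m):
    """Index of Gamma_0(m) in SL_2(Z) for squarefree m: the product of (p + 1) over p | m."""
    result = 1
    for p in prime_factors(m):
        result *= p + 1
    return result

M = 30  # sample squarefree level (illustrative)
print(len(coset_labels(M)), index_gamma0_squarefree(M))
```

The agreement of the two counts reflects that the cusp $\frac{\ell}{M}$ has width $\ell$ and that the widths of all cusps sum to the index.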
4.3.2. Coverings
The basic idea of the following lemma is to apply the Fricke involution to the tiling of $\Gamma _0(M) \backslash \mathbb {H}$ by translates of the standard fundamental domain for $\operatorname {\mathrm {SL}}_2(\mathbb {Z}) \backslash \mathbb {H}$ .
Lemma 4.1. Let $F : [\operatorname {\mathrm {SL}}_2] \rightarrow \mathbb {R}_{\ge 0}$ be a measurable function that is right U invariant and of weight $0$ . Then,
Here, $(\tau )_{\infty }$ denotes the image of $\tau $ in the Archimedean coordinate of $\operatorname {\mathrm {SL}}_{2}(\mathbb {A})$ .
Proof. Let $f: \mathbb {H} \to \mathbb {R}_{\ge 0}$ be given by
Then, f is $\Gamma _0(M)$ invariant on the left and we have
The standard Siegel set $\{z \in \mathbb {H} \mid 0\le \Re (z)\le 1 \text { and } \Im (z) \ge \frac {\sqrt {3}}{2}\}$ contains a fundamental domain for $\operatorname {\mathrm {SL}}_2({\mathbb {Z}})$ . Using that the $\tau _{\ell }n(j)$ , for $j=0,\dots ,\ell -1$ and $\ell \mid M$ , form a representative set for $\Gamma _0(M) \backslash \operatorname {\mathrm {SL}}_2(\mathbb {Z})$ and that f is nonnegative, we may bound
Since $\tilde {\tau }_{M} \tilde {\tau }_{\ell } = \gamma \tilde {\tau }_{\frac {M}{\ell }}$ for some $\gamma \in \Gamma _0(M)$ , we have the identity
Substituting this identity above and applying the change of variables $\frac {M}{\ell ^2}z \mapsto z$ gives the desired result.
5. Theta kernels and their $L^2$ norms
5.1. Theta kernels and lifts
In this section, we summarize the required results on theta kernels and their lifts. The necessary theory is developed in Appendix A.
5.1.1. Theta functions
The theta kernels constructed in Appendix A are modular functions on $\operatorname {O}_{\det }(\mathbb {A})\times \operatorname {\mathrm {SL}}_2(\mathbb {A})$ . The group G acts by conjugation on the quadratic space $(B,\det )$ , preserving the quadratic form. This gives an embedding $G\hookrightarrow \operatorname {O}_{\det }$ . We are mainly concerned here with the pullback of the theta kernels to $G(\mathbb {A})\times \operatorname {\mathrm {SL}}_2(\mathbb {A})$ . We denote that pullback by $\Theta (g,s)$ . The function $\Theta (g,s)$ is right $K_R \times U_R^1$ -invariant, where $U_R^1$ is the group U of §4.3 with $M = d_B N$ , and of moderate growth. We caution that it is not a theta kernel for the Howe dual pair of the orthogonal group of the traceless quaternions and $\operatorname {\mathrm {SL}}_2$ .
We shall require several explicit expressions of the theta kernels $\Theta $ . Define functions $P, u, X$ on $B_\infty $ by setting, for $\gamma = [a,b,c] + d \in B_\infty $ ,
In other words, by identifying ${\mathbf{i}} \in B_{\infty }$ with $i \in \mathbb {C}$ , we have that

∘ X is the projection from $B_\infty = \mathbb {C} \oplus \mathbb {C} {\mathbf{j}} $ to the summand $\mathbb {C}$ ,

∘ u is the squared magnitude of the projection onto the other summand $\mathbb {C} {\mathbf{j}} $ and

∘ P is the sum of the squared magnitudes of the two projections.
Upon recalling the notation $R(\ell ;g)$ from Section §3.1, we define for $g \in G(\mathbb {A})$ and $z=x+iy \in \mathbb {H}$ the theta functions
if $k \ge 6$
where $P_{m}$ is the mth Legendre polynomial. When $\ell = 1$ , we abbreviate by dropping the subscript, for example, $\theta _{g} := \theta _{g,1}$ . We are now ready to express $\Theta $ by means of strong approximation. Set
Then, $\Theta (g, s_{\infty }U_R^1) = \theta _g(z) e^{i \kappa \theta }$ for some $\kappa \in 2\mathbb {Z}$ and a choice of $\theta _{g}$ from Equations (5.2)–(5.5). The precise choice and the value of $\kappa $ depend on the family $\mathcal {G}$ under consideration and may be read off Table 1. (For our study of $\mathcal {F}^{-,\operatorname {hol}}$ , the precise choice of $\Theta $ depends upon the size of k.)
Anticipating the application of Lemma 4.1, we further require Fourier–Whittaker expansions of $\Theta $ at all of the cusps. They are expressible in terms of the $\theta _{g,\ell }$ from Equations (5.2)–(5.5) and a weight $\kappa $ , the choice of which is given by Table 1 as before. We have
for $\ell \mid d_B N$ with $\mu $ the Möbius function, $\tau _{\ell }$ as in Equation (4.2), and where $(\tau _{\ell })_{\infty }$ denotes the image of $\tau _{\ell }$ in the Archimedean coordinate of $\operatorname {\mathrm {SL}}_{2}(\mathbb {A})$ .
5.1.2. Jacquet–Langlands lifts
Set $U_R$ to be the image of
in $\operatorname {PGL}_2(\mathbb {A}_f)$ . This is a compact open subgroup of $\operatorname {PGL}_2(\mathbb {A}_f)$ . For each $\varphi $ in the families $\mathcal {F}^{-}, \mathcal {F}^{-,\operatorname {hol}}, \mathcal {F}^{+}_m, \mathcal {F}^{+,\operatorname {hol}}$ , we consider the Jacquet–Langlands transfer $\pi ^{\operatorname {\mathrm {JL}}}$ to $[\operatorname {PGL}_2]$ of the representation $\pi $ generated by $\varphi $ . In the case that G is split, we let $\pi ^{\operatorname {\mathrm {JL}}}=\pi $ . The space of vectors in $\pi ^{\operatorname {\mathrm {JL}}}$ that are $U_R$ -invariant and $K_{\infty }$ -isotypical of minimal nonnegative weight is one-dimensional [Reference Jacquet and LanglandsJL70, Reference CasselmanCas73]. We define the arithmetically normalized Jacquet–Langlands lift $\varphi ^{\operatorname {\mathrm {JL}}}\in L^2([\operatorname {\mathrm {SL}}_2])$ of $\varphi $ to be the $U_R^1$ -invariant restrictionFootnote ^{3} to $[\operatorname {\mathrm {SL}}_2]$ of a vector in this one-dimensional space that has a Whittaker function at $\left (\begin {smallmatrix} y^{1/2} & xy^{-1/2} \\ & y^{-1/2} \end {smallmatrix}\right ) \in \operatorname {\mathrm {SL}}_2(\mathbb {R}) \hookrightarrow \operatorname {\mathrm {SL}}_2(\mathbb {A})$ given by

∘ $2 \sqrt {y} K_{it}(2 \pi y)e(x) \text { if } \varphi \in \mathcal {F}^{-}_{\frac {1}{4}+t^2}$ , and

∘ $y^{\frac {\kappa }{2}} e(x+iy)$ if $\varphi $ is in any of the families $\mathcal {F}^{-,\operatorname {hol}}$ , $ \mathcal {F}^{+}_m$ , $\mathcal {F}^{+,\operatorname {hol}}$ , where $\kappa =k$ , $2m+2$ , $k+2$ depends on the family as before.
The bounds by Hoffstein–Lockhart [Reference Hoffstein and LockhartHL94] and Iwaniec [Reference IwaniecIwa90] then imply the following bounds for the $L^2$ norm of the arithmetically normalized Jacquet–Langlands lift (cf. [Reference Harcos and MichelHM06, (30), (31)]Footnote ^{4}). One may also compare with the geometric normalization in [Reference Petrow and YoungPY19, Thm. 6.1]. If B is indefinite and $\varphi \in \mathcal {F}^{-}_{\frac {1}{4}+t^2}$ , we have
In the other cases, that is, when $\varphi $ lies in any of the families $\mathcal {F}^{-,\operatorname {hol}}$ , $ \mathcal {F}^{+}_m$ or $\mathcal {F}^{+,\operatorname {hol}}$ , we have
where $\kappa $ depends on the family in accord with Table 1.
5.1.3. Explicit theta lifting
The key identity is summarized in the following proposition.
Proposition 5.1. Let $g \in [G]$ . Let $\mathcal {G}$ , $\Theta $ and $\kappa $ be as in Table 1. Then, for $\varphi \in \mathcal {G}$ , we have
Proof. The proof is carried out in the appendix. In short, Proposition A.16 implies that for $\varphi \in \mathcal {G}$ , the theta lift $\varphi _{\Phi }$ of $\varphi $ – defined in Equation (A.5), depending upon the precise family $\mathcal {G}$ – satisfies $\varphi _{\Phi }=(V_{d_B,N})^{-1}\varphi ^{\mathrm {JL}}$ . The claim then follows from Propositions A.15 and A.12.
5.2. $L^2$ norms of theta kernels
5.2.1. Proofs of Propositions 3.2 through 3.5
The proofs are similar, so we discuss the first in detail and then explain the nonoverlapping parts of the rest. Recall the notation $\preccurlyeq $ from §2.3. We denote by $\Theta ^{-}, \Theta ^{-,\operatorname {hol}}, \Theta ^{+,m}, \Theta ^{+,\operatorname {hol}}$ the various functions ‘ $\Theta $ ’ defined as in §5.1.1.
Proof of Proposition 3.2.
Let $\mathcal {G}$ denote either $\mathcal {F}^{-}_{\le L}$ or $\mathcal {F}^{-,\operatorname {hol}}$ according to whether $k=0$ or $k \ge 2$ . By Proposition 5.1, we may write
By Bessel’s inequality, it follows that
We now bound the right-hand side of Equation (5.10) (and, in particular, verify that it is finite). Since $\Theta ^{-}(g,\cdot )$ is $K_{\infty }$ -isotypical, Lemma 4.1 and Equation (5.6) give the bound
We insert the definition (5.2) into the inner integral and evaluate, giving
Note that the sum over i kills the contribution from $\gamma = 0$ , so we may omit that contribution in what follows. We separate the two sums by Cauchy–Schwarz and bound $X(\alpha )$ by $P(\alpha )^{\frac {1}{2}}$ , giving
We now treat the contributions from $i=1,2$ individually. We commence with the integral in the variable y. Let
denote the normalized incomplete gamma function. Setting , we find
Since $Q(s,x)$ is superpolynomially small in both s and x as soon as $x \gg s$ , we see by dyadically partitioning $\max _i\{P(\gamma _i)^{\frac {1}{2}}\}$ that Equation (5.13) is further bounded by
for any $A \ge 0$ , where $T_j=2^j, j \in \mathbb {Z}$ . By putting all of these estimates together, we arrive at
for any $A \ge 0$ . Let us recall from Equation (4.1) that $V_{d_B,N}, V_{1,d_BN} = (d_BN)^{1+o(1)}$ and that for $\varphi \in \mathcal {F}^{-}_{\le L}$ we have $\|\varphi ^{\operatorname {\mathrm {JL}}}\| \succcurlyeq _{L} 1$ (see Equation (5.7)). In order to conclude the first part of the proposition, we note that the range of T may be limited from above to $\preccurlyeq (d_BN)^{\frac {1}{2}} \ell ^{-1}$ by the superpolynomial decay and any polynomial bound on the second moment matrix count, which was noted in §3.4 and is the subject of the remaining sections §6 through §10. The second part of the proposition follows along the same lines, but we need to use the bound $\|\varphi ^{\operatorname {\mathrm {JL}}}\|^2 \succcurlyeq \Gamma (k) (4 \pi )^{-k}$ for $\varphi \in \mathcal {F}^{-,\operatorname {hol}}$ instead (see Equation (5.8)).
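The decay of the normalized incomplete gamma function used in this proof can be illustrated numerically. The sketch below assumes that Q denotes the upper regularized incomplete gamma function $Q(s,x) = \Gamma(s,x)/\Gamma(s)$ (the displayed definition is not reproduced here, so this is an assumption) and restricts to positive integer s, where the closed form $Q(s,x) = e^{-x}\sum_{j<s} x^j/j!$ is available. The function name is hypothetical.

```python
import math

def Q_integer(s, x):
    """Upper regularized incomplete gamma Q(s, x) for integer s >= 1 and x >= 0.

    Uses the closed form Q(s, x) = exp(-x) * sum_{j=0}^{s-1} x^j / j!,
    which is valid only for positive integer s.
    """
    term = 1.0   # current term x^j / j!, starting at j = 0
    total = 1.0  # running partial sum of the exponential series
    for j in range(1, s):
        term *= x / j
        total += term
    return math.exp(-x) * total

# For fixed s, the value is close to 1 when x is small compared to s
# and drops off rapidly once x is a large multiple of s.
for x in (1.0, 10.0, 20.0, 40.0):
    print(x, Q_integer(10, x))
```

This matches the qualitative statement in the text: the dyadic ranges in which $x$ greatly exceeds $s$ contribute negligibly.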
Proof of Proposition 3.3.
We follow the recipe of the previous proof, only this time for the family $\mathcal {F}^{-,\operatorname {hol}}$ to which the theta function $\Theta ^{-,\operatorname {hol}}$ corresponds. As we shall see, the latter already possesses a finite $L^2$ norm. Hence, we need not consider a difference of theta functions. After the initial steps, we arrive at
We further simplify using the lower bound $\|\varphi ^{\operatorname {\mathrm {JL}}}\|^2 \succcurlyeq \Gamma (k) (4 \pi )^{-k}$ (see Equation (5.8)), the approximations $V_{d_B,N}, V_{1,d_BN} = (d_B N)^{1+o(1)}$ and the superpolynomial decay of the normalized incomplete gamma function, as well as the identities
We obtain
for any $A \ge 0$ . By the triangle inequality, we reduce to estimating similar expressions but with the sum over $\gamma $ restricted by one of the following conditions:

(i) $u(\gamma ) \le k^{-1+\varepsilon } \det (\gamma )$ ,

(ii) $k^{-1+\varepsilon }\det (\gamma ) \le u(\gamma ) \le \det (\gamma )$ or

(iii) $\det (\gamma ) \le u(\gamma )$ .
In Case (i), we bound $(1+u(\gamma )/n)^{-\frac {k}{2}} \le 1$ . Furthermore, we have $\det (\gamma ) \asymp P(\gamma )$ and $u(\gamma ) \ll k^{-1+\varepsilon } P(\gamma )$ . Hence, after dyadically partitioning the range of $P(\gamma )^{\frac {1}{2}}$ , we arrive at
In Case (ii), we use that $(1+u(\gamma )/n)^{-\frac {k}{2}}\le (1+k^{\varepsilon -1})^{-\frac {k}{2}}$ has superpolynomial decay in k (and hence also in $(d_BN)$ as $k \gg _{\eta } (d_BN)^{\eta }$ by assumption). As in Case (i), we have $ \det (\gamma ) \asymp P(\gamma )$ , but this time only $u(\gamma )\le P(\gamma )$ . We arrive at a contribution of
In Case (iii), we bound
We use the factor $2^{-\frac {k}{4}}$ for superpolynomial decay in $(kd_BN)$ as before. The other factor we use as follows:
Hence, Equation (5.15) is bounded by
where we have dyadically partitioned $\max _i\{P(\gamma _i)^{\frac {1}{2}}\} \asymp \sqrt {u(\gamma _1)+u(\gamma _2)}$ . The proof of the proposition is now concluded as in the previous case.
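The superpolynomial decay invoked in Case (ii) of the preceding proof can be made explicit by an elementary estimate. The following worked inequality is a sketch, assuming $k \ge 1$ and $0 < \varepsilon < 1$, so that $x = k^{\varepsilon-1} \in (0,1]$ and $\log(1+x) \ge x/2$; the constants are not optimized and are not taken from the source.

```latex
\[
  \bigl(1 + k^{\varepsilon-1}\bigr)^{-\frac{k}{2}}
  = \exp\Bigl(-\tfrac{k}{2}\log\bigl(1 + k^{\varepsilon-1}\bigr)\Bigr)
  \le \exp\Bigl(-\tfrac{k}{2}\cdot\tfrac{1}{2}\,k^{\varepsilon-1}\Bigr)
  = \exp\bigl(-\tfrac{1}{4}\,k^{\varepsilon}\bigr)
  \ll_{A,\varepsilon} k^{-A}
  \qquad \text{for every } A \ge 0 .
\]
```

Under the assumption $k \gg_{\eta} (d_BN)^{\eta}$, the same bound gives decay faster than any fixed power of $d_BN$.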
Proof of Proposition 3.4.
We treat the definite spherical case in the same spirit as the indefinite spherical case. We readily arrive at the estimate
We simplify the inequality by using the lower bound $\|\varphi ^{\operatorname {\mathrm {JL}}}\|^2 \succcurlyeq \Gamma (2m+2) (4 \pi )^{-2m-2}$ (see Equation (5.8)), the approximations $V_{d_B,N}, V_{1,d_BN}=(d_BN)^{1+o(1)}$ and the superpolynomial decay of the normalized incomplete gamma function Q. We obtain
for any $A \ge 0$ . We proceed further by appealing to the Bernstein inequality [Reference BernsteinBer31] for the Legendre polynomials:
We recall that $\det (\gamma )= X(\gamma )^2 + u(\gamma )$ so that, with $\gamma $ and n as in the above sum,
Dyadically partitioning $P(\gamma )^{\frac {1}{2}}=\det (\gamma )^{\frac {1}{2}}$ , we conclude that Equation (5.17) is bounded by
where $T_j = 2^j$ as before. The minimum in Equation (5.19) lies between $\asymp (m+1)^{-\frac {1}{2}}$ and $1$ . Let us consider the $\gamma \in \Omega ^{\star }(1,2T_j) \setminus \Omega ^{\star }(1,T_j)$ for which
for some given $\delta $ with $1/(m+1)^2 \ll \delta \le 1$ . In particular, $X(\gamma )^2 \cdot u(\gamma ) \ll \delta T_j^4$ . Since $X(\gamma )^2+u(\gamma )=P(\gamma ) \asymp T_j^2$ , both cannot be simultaneously small. Hence,
Thus, after replacing $\delta $ with its multiple by a scalar of the form $\asymp 1$ if needed (which has no effect on Equation (5.20)), we may assume that $\min \{X(\gamma )^2, u(\gamma )\} \le \delta (2T_j)^2$ , that is, that $\gamma $ lies in either $\Omega ^{\star }(\delta ,2T_j)$ or $\Psi ^{\star }(\delta ,2T_j)$ . We now consider dyadic scales $\delta _a$ of $\delta $ ’s between $\asymp 1/(m^2+1)$ and $1$ . The arguments just mentioned then allow us to bound the second line in Equation (5.19) by
There are at most $ \preccurlyeq 1$ dyadic scales in the range $1/(m^2+1) \ll \delta \le 1$ . Thus, after applying Cauchy–Schwarz in order to pull out the sum over $\delta _a$ , we bound Equation (5.19) by
The proof is once more concluded as it was for the first proposition.
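The assertion in the proof above that $X(\gamma )^2$ and $u(\gamma )$ cannot both be small admits a short verification. As a sketch, write $a = X(\gamma )^2$ and $b = u(\gamma )$ (shorthand introduced only for this remark) and assume, as the decomposition $\det (\gamma ) = X(\gamma )^2 + u(\gamma )$ suggests in the definite case, that $a, b \ge 0$ . Since $a + b = P(\gamma ) \asymp T_j^2$ , we have $\max \{a,b\} \ge \tfrac {1}{2} P(\gamma )> 0$ , and therefore
$$ \min \{a,b\} = \frac {ab}{\max \{a,b\}} \le \frac {2ab}{P(\gamma )} \ll \frac {\delta T_j^4}{T_j^2} = \delta T_j^2. $$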
Proof of Proposition 3.5.
One final time we iterate the initial steps for the holomorphic family $\mathcal {F}^{\operatorname {hol}}$ in the definite case. We are again in the situation where the theta kernel $\Theta ^{+,\operatorname {hol}}$ has finite $L^2$ norm, so we obtain, without having to take differences,
We simplify this estimate using the lower bound $\|\varphi ^{\operatorname {\mathrm {JL}}}\|^2 \succcurlyeq \Gamma (k+2) (4 \pi )^{-k-2}$ (see Equation (5.8)), the approximations $V_{d_B,N}, V_{1,d_BN}=(d_BN)^{1+o(1)}$ , the superpolynomial decay of the normalized incomplete gamma function Q and the identity $X(\gamma )^2=\det (\gamma )-u(\gamma )$ . We thereby obtain
for any $A \ge 0$ . We dyadically partition $P(\gamma )^{\frac {1}{2}}=\det (\gamma )^{\frac {1}{2}}$ and distinguish the two cases:

(i) $u(\gamma ) \le k^{-1+\varepsilon } \det (\gamma )$ , and

(ii) $k^{-1+\varepsilon }\det (\gamma ) \le u(\gamma )$ .
We separate them by the triangle inequality. Using the inequality $(1-u(\gamma )/n)\le 1$ , we see that the contribution of the first case is bounded by
where $T_j=2^j, j \in \mathbb {Z}$ . In the second case, we see that $(1-k^{\varepsilon -1})^{\frac {k}{2}}$ enjoys superpolynomial decay in k. The contribution of the second case is thus bounded by
The proof is now concluded as it was for the previous propositions.
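The decay used in the second case above can be checked directly. As a sketch, assuming $k \ge 1$ and $0 < \varepsilon < 1$ , the elementary inequality $1-x \le e^{-x}$ applied with $x = k^{\varepsilon -1} \in (0,1]$ gives
$$ \left (1-k^{\varepsilon -1}\right )^{\frac {k}{2}} \le \exp \left (-\tfrac {1}{2} k^{\varepsilon }\right ). $$
For a sense of scale (with values chosen purely for illustration), $k = 10^4$ and $\varepsilon = 1/2$ give $k^{\varepsilon } = 100$ , so the right-hand side is $e^{-50}$ , which is below $10^{-21}$ .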
6. Preliminaries on the geometry of numbers
6.1. Bounds on successive minima
Definition 6.1. Let V be an n-dimensional real vector space. Let $L \subseteq V$ be a lattice (i.e., a cocompact discrete subgroup). Given a compact convex $0$-symmetric subset $\mathcal {K}$ of V with nonempty interior, we define a function $N : V \rightarrow \mathbb {R}_{\ge 0}$ by $N(v) := \inf \{t> 0 : v \in t \mathcal {K} \}$ . Given a positive-definite quadratic form Q on V, we define such a function by $N(v) := Q(v)^{1/2}$ , or equivalently, by applying the previous definition with $\mathcal {K}$ the unit ball for Q. In either case, we define the successive minima $\lambda _1 \le \dotsb \le \lambda _n$ of $\mathcal {K}$ on L (or of Q on L) as follows: $\lambda _k$ is the smallest positive real for which there is a linearly independent subset $\{v_1,\dotsc ,v_k\}$ of L with $N(v_j) \le \lambda _k$ for each $1 \le j \le k$ .
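As a concrete illustration of this definition (the lattice and the form are chosen purely for this example), take $V = \mathbb {R}^2$ , $L = \mathbb {Z}^2$ and $Q(x,y) = x^2 + 4y^2$ , so that $N(x,y) = (x^2+4y^2)^{1/2}$ . A nonzero $(x,y) \in L$ with $y = 0$ satisfies $N(x,y) = |x| \ge 1$ , with equality at $(\pm 1, 0)$ , while any $(x,y) \in L$ with $y \neq 0$ satisfies $N(x,y) \ge 2$ , with equality at $(0, \pm 1)$ . Since every vector linearly independent of $(1,0)$ has $y \neq 0$ , we find
$$ \lambda _1 = N(1,0) = 1, \qquad \lambda _2 = N(0,1) = 2. $$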
Lemma 6.2. Let $z\in \mathbb {H}$ have maximal imaginary part within its orbit under the Atkin–Lehner operators $A_0(N)$ of $\Gamma _0(N)$ , with N squarefree. Then we have
for any $(c,d) \in \mathbb {Z}^2$ distinct from $(0,0)$ .
Proof. This is essentially [Reference Harcos and TemplierHT12, Lemma 1]. That reference gives the slightly weaker bound obtained by omitting the factor $(c,N)$ , but the stronger bound that we have stated follows from their proof, keeping track of $(c,N)$ at each step rather than bounding it from below by $1$ .
6.2. Lattice counting
Lemma 6.3. Let $f_{\mathcal {K}}$ be the distance function of a closed convex $0$-symmetric set $\mathcal {K}\subseteq \mathbb {R}^n$ of positive volume. Let $\Lambda \subset \mathbb {R}^n$ be a lattice, and let $ \lambda _1 \le \lambda _2 \le \dots \le \lambda _n$ denote the successive minima (see Definition 6.1) of $\mathcal {K}$ on $\Lambda $ . Then, there is a basis $v_1,\dots ,v_n$ of $\Lambda $ such that $f_{\mathcal {K}}(v_i) \asymp _n \lambda _i$ .
Proof. This is [Reference Gruber and LekkerkerkerGL87, Thm. 2, p. 66].
Lemma 6.4. Let $\mathcal {K}\subseteq \mathbb {R}^n$ be a closed convex $0$-symmetric set of positive volume. Let $\Lambda \subset \mathbb {R}^n$ be a lattice, and let $ \lambda _1 \le \lambda _2 \le \dots \le \lambda _n$ denote the successive minima of $\mathcal {K}$ on $\Lambda $ . Then
Proof. The lower bound follows from van der Corput’s generalization of Minkowski’s first theorem [Reference van der CorputvdC36]. It states that for $\mathcal {K}' \subset \mathbb {R}^d$ a closed convex $0$-symmetric set and $\Lambda \subset \mathbb {R}^d$ a lattice, one has
Let d be the largest integer such that $\lambda _d \le 1$ . Let $v_i\in \Lambda $ , for $i=1,\dots ,d$ , be a set of linearly independent vectors such that $\lambda _i^{-1}v_i \in \mathcal {K}$ . Let $\mathcal {K}'$ be the convex hull of the vectors $\pm \lambda _i^{-1}v_i$ and $\Lambda '$ the span of the vectors $v_i$ . In particular, $\mathcal {K} '$ is nonempty, hence $0 \in \mathcal {K}' \cap \Lambda '$ , and so
Using Equation (6.1), it follows now that
For the upper bound, we refer to [Reference Betke, Henk and WillsBHW93, Prop. 2.1].
Lemma 6.5. Let $\Lambda \subset \mathbb {R}^2$ be a lattice of rank $2$ and $B\subseteq \mathbb {R}^2$ a ball of radius R (not necessarily centred at $0$ ). If $\lambda _1 \le \lambda _2$ are the successive minima of $\Lambda $ , then
Proof. See [Reference Harcos and TemplierHT13, Lemma 2.1].
7. Local preliminaries on orders
7.1. Quadratic preliminaries
Let F be a non-Archimedean local field of characteristic $\neq 2$ . Let $E/F$ be a separable quadratic extension, thus E is either the split quadratic extension $F \oplus F$ or a quadratic field extension. We write $\mathfrak {o}$ (resp. $\mathfrak {o}_E$ ) for the ring of integers in F (resp. E), $x \mapsto \bar {x}$ for the canonical involution on E and
$$ \operatorname {\mathrm {nr}}(x) := x \bar {x}, \qquad \operatorname {\mathrm {tr}}(x) := x + \bar {x} $$
for the norm and trace. Recall that the different ideal $\mathfrak {d}$ for this extension is the smallest $\mathfrak {o}_E$-ideal for which $\operatorname {\mathrm {tr}}(\mathfrak {d}^{-1}) \subseteq \mathfrak {o}$ , and in fact $\mathfrak {d}^{-1} = \{ x \in E: \operatorname {\mathrm {tr}}(x \mathfrak {o}_E) \subseteq \mathfrak {o} \}$ . If $E/F$ is split or unramified, then $\mathfrak {d} = \mathfrak {o}_E$ .
We may regard E as a two-dimensional vector space over F.
Let $q : E \rightarrow F$ be a nondegenerate binary quadratic form with the property that for all $e,x \in E$ , we have $q(e x) =\operatorname {\mathrm {nr}}(e) q(x)$ . In other words, q is an F-multiple of $\operatorname {\mathrm {nr}}$ , specifically $q = q(1)\operatorname {\mathrm {nr}}$ .
For $x,y \in E$ , we set $\langle x, y \rangle := q(x+y) - q(x) - q(y)$ so that $q(x) = \langle x, x \rangle /2$ .
Let $\mathfrak {a} \subset E$ be a fractional $\mathfrak {o}_E$-ideal. Write $\mathfrak {a}^\vee $ for the dual of $\mathfrak {a}$ with respect to the quadratic form q, that is, $\mathfrak {a}^\vee := \{ x \in E : \langle x, y \rangle \in \mathfrak {o} \text { for all } y \in \mathfrak {a} \}$ .
Let $\mathfrak {n}$ denote the fractional $\mathfrak {o}$-ideal generated by $q(\mathfrak {a})$ .
Lemma 7.1. We have $\mathfrak {a} = \mathfrak {d} \mathfrak {n} \mathfrak {a}^\vee $ .
Proof. Let $\alpha $ be a generator of $\mathfrak {a}$ . Then $\mathfrak {n} = q(1)\operatorname {\mathrm {nr}}(\mathfrak {a}) = \mathfrak {o} q(1) \alpha \bar {\alpha }$ and $\mathfrak {a}^\vee = \{ q(1)^{-1} x: x \in E, \operatorname {\mathrm {tr}}(x \bar {\mathfrak {a}}) \subseteq \mathfrak {o} \} = q(1)^{-1} \bar {\alpha }^{-1} \mathfrak {d}^{-1}$ . Multiplying through, the conclusion follows.
Corollary 7.2. Suppose that $E/F$ is unramified and that q is integral on $\mathfrak {a}$ so that $\mathfrak {a} \subseteq \mathfrak {a}^\vee $ . Then the elementary divisors for the $\mathfrak {o}$-module inclusion $\mathfrak {a} \hookrightarrow \mathfrak {a}^\vee $ are $(\mathfrak {n}, \mathfrak {n})$ .
Proof. Our hypotheses imply that $\mathfrak {d} = \mathfrak {o}_E$ and that $\mathfrak {n}$ is an integral ideal. The lemma implies that there is an isomorphism (first of $\mathfrak {o}_E$-modules, then of $\mathfrak {o}$-modules) $\mathfrak {a}^\vee / \mathfrak {a} \cong \mathfrak {o}_E / \mathfrak {n} \mathfrak {o}_E \cong (\mathfrak {o}/\mathfrak {n})^2$ , whence the conclusion.
Remark 7.3. Under the hypotheses of the corollary, the discriminant ideal of the binary quadratic form $(q,\mathfrak {a})$ is $\mathfrak {n}^2$ . More generally, under the hypotheses of the lemma, the discriminant ideal is $\mathfrak {D} \mathfrak {n}^2$ , with $\mathfrak {D} =\operatorname {\mathrm {nr}} (\mathfrak {d})$ . Conversely, given the discriminant ideal, we may compute $\mathfrak {n}$ as its square root.
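To illustrate Lemma 7.1, Corollary 7.2 and Remark 7.3, here is a worked example under assumptions made only for illustration: $F = \mathbb {Q}_p$ , $E/F$ the unramified quadratic extension, $q = \operatorname {\mathrm {nr}}$ (so that $q(1) = 1$ ) and $\mathfrak {a} = p^m \mathfrak {o}_E$ with $m \ge 0$ . Then $\mathfrak {d} = \mathfrak {o}_E$ , and taking the generator $\alpha = p^m$ in the proof of Lemma 7.1 gives
$$ \mathfrak {n} = p^{2m} \mathfrak {o}, \qquad \mathfrak {a}^\vee = p^{-m} \mathfrak {o}_E, \qquad \mathfrak {d} \mathfrak {n} \mathfrak {a}^\vee = p^{2m} \cdot p^{-m} \mathfrak {o}_E = \mathfrak {a}, $$
$$ \mathfrak {a}^\vee / \mathfrak {a} = p^{-m} \mathfrak {o}_E / p^{m} \mathfrak {o}_E \cong \mathfrak {o}_E / p^{2m} \mathfrak {o}_E \cong (\mathfrak {o}/p^{2m}\mathfrak {o})^2. $$
Thus the elementary divisors are $(\mathfrak {n}, \mathfrak {n})$ and the discriminant ideal is $\mathfrak {n}^2 = p^{4m} \mathfrak {o}$ , in agreement with the statements above.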
7.2. Quaternionic preliminaries: general case
Let F be a non-Archimedean local field, let B be a quaternion F-algebra and let E be a separable quadratic F-subalgebra of B. We equip B with the quadratic form $q : B \rightarrow F$ given by the reduced norm, whose bilinearization $\langle \, , \rangle $ as above is described by the reduced trace and the main involution on B via the formula $\langle x, y \rangle = \operatorname {\mathrm {tr}}(x \bar {y})$ . We have a canonical decomposition $B = E \oplus E^\perp $ , where $E^\perp = \{ x \in B : \langle x, y \rangle = 0 \text { for all } y \in E \}$ .
Let $\mathfrak {o}$ and $\mathfrak {o}_E$ denote the respective maximal orders of F and E. We write $\mathfrak {d}$ for the different ideal, as before.
Let us say that an order R in B is E-adapted if it is of the form $R = \mathfrak {o}_E \oplus \mathfrak {a}$ for some $\mathfrak {o}_E$-submodule $\mathfrak {a}$ of $E^\perp $ (for the action by either left or right multiplication; it does not matter which, because they are conjugates of each other).
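As an illustration of this definition (the algebra, subalgebra and order below are assumptions made only for this example), take $B = M_2(F)$ and let E be the diagonal subalgebra, which is isomorphic to the split extension $F \oplus F$ . The main involution sends $\left ( \begin {smallmatrix} a & b \\ c & d \end {smallmatrix} \right )$ to $\left ( \begin {smallmatrix} d & -b \\ -c & a \end {smallmatrix} \right )$ , so for $x = \left ( \begin {smallmatrix} a & b \\ c & d \end {smallmatrix} \right )$ and $y = \operatorname {diag}(y_1,y_2)$ one computes $\langle x, y \rangle = \operatorname {\mathrm {tr}}(x \bar {y}) = a y_2 + d y_1$ . Hence $E^\perp $ consists of the antidiagonal matrices. Writing $\mathfrak {p}$ for the maximal ideal of $\mathfrak {o}$ and fixing an integer $r \ge 0$ , we have
$$ R = \begin {pmatrix} \mathfrak {o} & \mathfrak {o} \\ \mathfrak {p}^r & \mathfrak {o} \end {pmatrix} = \mathfrak {o}_E \oplus \mathfrak {a}, \qquad \mathfrak {a} = \left \{ \begin {pmatrix} 0 & b \\ c & 0 \end {pmatrix} : b \in \mathfrak {o},\ c \in \mathfrak {p}^r \right \}. $$
Here $\mathfrak {a}$ is stable under multiplication by $\mathfrak {o}_E = \operatorname {diag}(\mathfrak {o},\mathfrak {o})$ on either side, so R is E-adapted. Since the reduced norm is the determinant, which takes the value $-bc$ on $\mathfrak {a}$ , the $\mathfrak {o}$-ideal generated by $q(\mathfrak {a})$ is $\mathfrak {p}^r$ .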
Consider such an order R. Its traceless submodule is given by
We aim to compute the dual lattice $(R^0)^\vee $ with respect to q. To that end, it suffices to dualize each summand in the above decomposition because $(R^0)^\vee = (\mathfrak {o}_E^0)^\vee \oplus \mathfrak {a}^\vee $ . We generally have ${(\mathfrak {d}^{-1})^0 \subseteq (\mathfrak {o}_E^0)^\vee \subseteq \frac {1}{2} (\mathfrak {d}^{-1})^0}$ . If E is unramified or split, then $(\mathfrak {o}_E^0)^\vee = \frac {1}{2} \mathfrak {o}_E^0 = \frac {1}{2} (\mathfrak {d}^{-1})^0$ . On the other hand, we can compute $\mathfrak {a}^\vee $ using the results of the previous section. Indeed, the choice of any invertible element $j \in E^\perp $ defines an isomorphism $E \rightarrow E^\perp $ , $x \mapsto x j$ . Transporting q and $\mathfrak {a}$ via the inverse of this isomorphism gives us a fractional ideal in E and a quadratic form on E that satisfy the hypotheses of that section. We obtain
where $\mathfrak {n}$ is the integral $\mathfrak {o}$-ideal characterized by either of the following properties:

∘ $\mathfrak {n}$ is generated by $q(\mathfrak {a})$ .

∘ $\mathfrak {D} \mathfrak {n}^2$ is the discriminant ideal of