1 Introduction
The general problem we consider in this paper can be phrased as the following question: how large can a set be if it does not contain a given geometrical configuration?
The simplest and most well-studied instance of this problem concerns forbidden configurations of only two points on $\mathbb {R}^d$, which are then characterized by their distance; since there clearly exist unbounded sets on $\mathbb {R}^d$ which do not span a given distance, the appropriate notion of ‘largeness’ must take into account their density rather than their cardinality or measure. Define the upper density $\overline {d}(A)$ of a measurable set $A \subseteq \mathbb {R}^d$ by
$$\overline {d}(A) := \limsup _{R \rightarrow \infty } \frac{\textrm{vol}(A \cap [-R,\, R]^d)}{\textrm{vol}([-R,\, R]^d)},$$
where $\textrm{vol}$ denotes the Lebesgue measure. Our general problem in this case becomes: what is the maximum upper density that a subset of $\mathbb {R}^d$ can have if it does not contain pairs of points at distance $1$?
This extremal density is commonly denoted $m_1(\mathbb {R}^d)$, and it is related to the measurable chromatic number $\chi _m(\mathbb {R}^d)$ of the Euclidean space by the simple inequality $\chi _m(\mathbb {R}^d) \geq 1/m_1(\mathbb {R}^d)$. Indeed, if no colour class contains pairs of points at unit distance, then each of them has upper density at most $m_1(\mathbb {R}^d)$, and it takes at least $1/m_1(\mathbb {R}^d)$ such classes to cover the whole space. The parameter $m_1(\mathbb {R}^d)$ is often studied in the context of providing lower bounds for the measurable chromatic number.
Despite significant research on the subject, there is still no dimension $d \geq 2$ for which the value of $m_1(\mathbb {R}^d)$ is known. As far back as 1982, Erdős [Reference Erdős9] conjectured that $m_1(\mathbb {R}^2) < 1/4$, implying that any measurable planar set covering one fourth of the Euclidean plane contains pairs of points at unit distance; this conjecture is still open. A celebrated theorem of Frankl and Wilson [Reference Frankl and Wilson11] implies that $m_1(\mathbb {R}^d)$ decays exponentially with the dimension and yields the asymptotic upper bound $m_1(\mathbb {R}^d) \leq (1.2 + o(1))^{-d}$. We refer the reader to Bachoc, Passuello and Thiery [Reference Bachoc, Passuello and Thiery1] and to DeCorte, Oliveira and Vallentin [Reference DeCorte, de Oliveira Filho and Vallentin6] for the best known bounds on $m_1(\mathbb {R}^d)$ and $\chi _m(\mathbb {R}^d)$.
The situation becomes even more complex and interesting when one forbids multiple distances $r_1, \dots , r_n> 0$ ; let us denote by $\mathbf {m}_{\mathbb {R}^d}(r_1, \dots , r_n)$ the maximum upper density of a set in $\mathbb {R}^d$ avoiding all of these distances. This parameter was first studied by Székely [Reference Székely22, Reference Székely23] in connection with the chromatic number of geometric graphs, and it depends not only on the dimension of the space and number of forbidden distances but also on how these distances relate to each other.
In his first paper, Székely considered the connection between the structure of a set of forbidden distances and the maximum density of a set in Euclidean space which avoids them all, and conjectured that $\mathbf {m}_{\mathbb {R}^2}\big ((r_j)_{j \geq 1}\big ) = 0$ whenever the sequence $(r_j)_{j \geq 1}$ of forbidden distances is unbounded. His conjecture was proven by Furstenberg, Katznelson and Weiss [Reference Furstenberg, Katznelson and Weiss12], who used methods from ergodic theory to obtain the following result:
Theorem 1. If $A \subseteq \mathbb {R}^2$ has positive upper density, then there is some number $t_0$ such that for any $t \geq t_0$ , one can find a pair of points $x, y \in A$ with $\|x - y\| = t$ .
Using Fourier analytic methods, Bourgain [Reference Bourgain2] was then able to generalize this theorem from two-point configurations on $\mathbb {R}^2$ to d-point configurations in general position on $\mathbb {R}^d$ , for any $d \geq 2$ . For convenience, we shall say that a configuration $P \subset \mathbb {R}^d$ is admissible if it has at most d points and spans a $(|P|-1)$ -dimensional affine hyperplane. Bourgain showed the following:
Theorem 2. Suppose $P \subset \mathbb {R}^d$ is admissible. If $A \subseteq \mathbb {R}^d$ has positive upper density, then there is some number $t_0> 0$ such that A contains a congruent copy of $t \!\cdot \! P$ for all $t \geq t_0$ .
This result motivates the introduction of the independence density of a given family of configurations $P_1, P_2, \dots , P_n \subset \mathbb {R}^d$, denoted $\mathbf {m}_{\mathbb {R}^d}(P_1,\, P_2,\, \dots ,\, P_n)$, as the maximum upper density of a set in $\mathbb {R}^d$ which does not contain a congruent copy of any of these configurations. This parameter generalizes our earlier notion of extremal density $\mathbf {m}_{\mathbb {R}^d}(r_1, \dots , r_n)$ from two-point to higher-order configurations, and can be seen as the natural analogue of the independence number for the (infinite) geometrical hypergraph on $\mathbb {R}^d$ whose edges are all isometric copies of $P_j$, $1 \leq j \leq n$.
With the notation now introduced, Bourgain’s Theorem can be restated as the assertion that $\mathbf {m}_{\mathbb {R}^d}\big ((t_j P)_{j \geq 1}\big ) = 0$ for all admissible $P \subset \mathbb {R}^d$ and all unbounded positive sequences $(t_j)_{j \geq 1}$ ; his proof, in fact, implies the stronger result that
whenever the dilation parameters $t_j$ grow without bound. Seen in this light, his results might inspire several further natural questions; for instance:
- (Q1) What is the rate of decay of $\mathbf {m}_{\mathbb {R}^d}(t_1 P,\, t_2 P,\, \dots ,\, t_n P)$ with n as the ratios $t_{j+1}/t_j$ between consecutive scales get large?
- (Q2) What possible values can be taken by the independence density $\mathbf {m}_{\mathbb {R}^d}(t_1 P,\, t_2 P,\, \dots ,\, t_n P)$ of n distinct dilates of a given configuration P?
- (Q3) Are there analogous results which are valid for other (non-Euclidean) spaces?
The goal of the present paper is to initiate the study of the independence density function $\mathbf {m}_{\mathbb {R}^d}$ and related geometrical parameters, and the investigation of these three problems will serve as the driving force behind our analysis.
1.1 Outline of the paper
In Section 2, we will formally define the independence density of a family of configurations, both in the entire space $\mathbb {R}^d$ and when restricted to bounded cubes in $\mathbb {R}^d$ , and start our study of this geometrical parameter. The methods we use are a mix of Fourier analysis, functional analysis and combinatorics. The Fourier-analytic part is based mainly on Bourgain’s arguments from [Reference Bourgain2], and the combinatorial part is based on Bukh’s arguments from [Reference Bukh3] (where he considered similar problems to ours but concerning forbidden distances). We do not assume that the reader is familiar with either of these papers; instead, we give a presentation of the relevant parts of their reasoning that will be important to us.
The main tools to be used in this section will be a Counting Lemma (Lemma 4) and a Supersaturation Theorem (Theorem 3), both of which are conceptually similar to results of the same name in graph and hypergraph theory (see [Reference Rödl and Schacht18, Reference Castro-Silva4, Reference Erdős and Simonovits10]). Intuitively, the Counting Lemma says that the count of admissible configurations inside a given set does not significantly change if we blur the set a little; this will be proven by Fourier-analytic methods. The Supersaturation Theorem states that any bounded set $A \subseteq [-R,\, R]^d$ that is just slightly denser than the independence density of an admissible configuration P must necessarily contain a positive proportion of all congruent copies of P lying in $[-R,\, R]^d$; this is proven by functional-analytic methods, via a compactness and weak $^*$ continuity argument.
We will then use these tools to obtain several results on the independence density parameter, and in particular, answer questions (Q1) and (Q2) in the case where the considered configuration P is admissible. Regarding question (Q1), we show that $\mathbf {m}_{\mathbb {R}^d}(t_1 P,\, t_2 P,\, \dots ,\, t_n P)$ tends to $\mathbf {m}_{\mathbb {R}^d}(P)^n$ as the ratios $t_{j+1}/t_j$ get large; this generalizes a theorem of Bukh from two-point configurations to k-point configurations with $k \leq d$ and easily implies Bourgain’s Theorem discussed in the Introduction. As for question (Q2), we show that, by forbidding n distinct dilates of such a configuration P, we can obtain as independence density any real number strictly between $\mathbf {m}_{\mathbb {R}^d}(P)^n$ and $\mathbf {m}_{\mathbb {R}^d}(P)$, but none smaller than $\mathbf {m}_{\mathbb {R}^d}(P)^n$ or larger than $\mathbf {m}_{\mathbb {R}^d}(P)$. We also prove:
- The general lower bound $\mathbf {m}_{\mathbb {R}^d}(P_1,\, P_2,\, \dots ,\, P_n) \geq \prod _{i=1}^n \mathbf {m}_{\mathbb {R}^d}(P_i)$, which holds for all configurations $P_1, P_2, \dots , P_n \subset \mathbb {R}^d$;
- Continuity of the independence density function $\mathbf {m}_{\mathbb {R}^d}$ on the set of admissible configurations; and
- Existence of extremizer measurable sets (i.e., having maximal density) which avoid admissible configurations.
In Section 3, we will consider these same questions but related to the more complicated setting of sets on the unit sphere $\mathbb {S}^d$ . We will also present (and prove) a spherical analogue of Bourgain’s Theorem; this is in line with our question (Q3), as the sphere is the most well-studied non-Euclidean space.
Many of the arguments from the Euclidean setting will be used again in the spherical setting (in particular, the reliance on our two main combinatorial tools), but there are also some complications we need to solve that are intrinsic to the sphere. One of them is that harmonic analysis is (for our purposes) much more complicated on $\mathbb {S}^d$ than it is on $\mathbb {R}^d$, which makes our proof of the spherical Counting Lemma correspondingly harder and more technical than its Euclidean counterpart. Moreover, due to the lack of dilation invariance in the spherical setting, we will only be able to make modest progress towards answering its analogue of question (Q2) (and the answer to question (Q1) will be somewhat more intricate). The other results proven in the Euclidean space setting will continue to hold in the same form for sets on the sphere.
Finally, in Section 4, we discuss some related results in the literature and suggest several intriguing open problems in line with the results presented here.
1.2 Some remarks on notation
The same symbol will be used for both a set and its indicator function; for instance, if we are given $A \subseteq \mathbb {R}^d$, then $A(x) = 1$ if $x \in A$ and $A(x) = 0$ otherwise. The group of permutations of $\{1, \dots , k\}$ is denoted by $\mathfrak {S}_k$. Given a group G acting on some space X and an element x of this space, we write $\textrm{Stab}^G(x) := \{g \in G:\, g.x = x\}$ for the stabilizer subgroup of x.
The averaging notation $\mathbb {E}_{x \in X}$ is used to denote the expectation when the variable x is distributed uniformly over the set X. When X is (a subset of) a compact group G, this measure is (the restriction of) the normalized Haar measure on G, which is the unique Borel probability measure on G that is invariant under both left and right actions of this group. Similarly, we write $\mathbb {P}_{x \in X}$ to denote the probability under this same distribution.
2 Configurations in Euclidean space
Throughout this section, we shall fix an integer $d \geq 2$ and work on the d-dimensional Euclidean space $\mathbb {R}^d$ , equipped with its usual inner product $x \cdot y$ and associated Euclidean norm $\|x\|$ . We denote by $\textrm{vol}$ the Lebesgue measure on $\mathbb {R}^d$ and by $\mu $ the normalized Haar measure on the orthogonal group $\textrm{O}(\mathbb {R}^d) = \{O \in \mathbb {R}^{d \times d}:\, O^t O = I\}$ .
Given $x \in \mathbb {R}^d$ and $R> 0$ , we denote by $Q(x, R)$ the axis-parallel open cube of side length R centered at x. We write $d_{Q(x, R)}(A) := \textrm{vol}(A \cap Q(x, R))/R^d$ for the density of $A \subseteq \mathbb {R}^d$ inside the cube $Q(x, R)$ . The upper density of a measurable set $A \subseteq \mathbb {R}^d$ can then be written as $\overline {d}(A) = \limsup _{R \rightarrow \infty } d_{Q(0, R)}(A);$ if the limit exists, we shall instead denote it by $d(A)$ .
A configuration P is just a finite subset of $\mathbb {R}^d$, and we define its diameter $\textrm{diam}\, P$ as the largest distance between two of its points. Recall that a configuration $P \subset \mathbb {R}^d$ on k points is said to be admissible if $k \leq d$ and if P is nondegenerate (that is, if it spans a $(k-1)$-dimensional affine hyperplane). The space of k-point configurations can be given a metric induced from the Euclidean norm as follows: if $P = \{v_1, \dots , v_k\}$ and $Q = \{u_1, \dots , u_k\}$, the distance between P and Q is
$$\|Q - P\|_{\infty } := \min _{\sigma \in \mathfrak {S}_k} \, \max _{1 \leq i \leq k} \|u_i - v_{\sigma (i)}\|,$$
where the minimum is taken over all permutations $\sigma $ of $\{1, \dots , k\}$. It is easy to see that, under the topology induced by this metric, the set of admissible configurations is an open set and that it is dense inside the family of all subsets of $\mathbb {R}^d$ with at most d elements.
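To make this metric concrete, the following Python sketch computes the distance between two k-point configurations by brute force over all permutations. It assumes that the distance is the minimum, over all matchings of the points, of the largest Euclidean distance between matched points; the function name `config_distance` and the sample configurations are illustrative and do not come from the paper.

```python
from itertools import permutations
from math import dist


def config_distance(P, Q):
    """Distance between two k-point configurations, given as lists of points."""
    if len(P) != len(Q):
        raise ValueError("configurations must have the same number of points")
    k = len(P)
    # Try every matching of the points of Q to the points of P and keep the
    # matching whose worst pointwise Euclidean distance is smallest.
    return min(
        max(dist(P[i], Q[sigma[i]]) for i in range(k))
        for sigma in permutations(range(k))
    )


# A three-point configuration in R^3 and a relabelled copy with one point nudged.
P = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
Q = [(0.0, 1.0, 0.1), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(config_distance(P, Q))
```

The search takes $k!$ steps, which is harmless in this setting because admissible configurations have at most d points.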
We say that two configurations $P, Q \subset \mathbb {R}^d$ are congruent, and write $P \simeq Q$ , if they can be made equal using only rigid transformations; that is, $P \simeq Q$ if and only if there exist $x\in \mathbb {R}^d$ and $T \in \textrm{O}(\mathbb {R}^d)$ such that $P = x + T \cdot Q$ . Given a configuration $P \subset \mathbb {R}^d$ , we say that a set $A \subseteq \mathbb {R}^d$ avoids P if there is no subset of A which is congruent to P.
We can now formally define our main object of study in this section, the independence density of a configuration or family of configurations. There are, in fact, two closely related versions of this parameter we will need, depending on whether we are considering bounded or unbounded configuration-avoiding sets. Given $n \geq 1$ configurations $P_1, \dots , P_n \subset \mathbb {R}^d$, we then define the quantities
$$\mathbf {m}_{\mathbb {R}^d}(P_1, \dots , P_n) := \sup \big \{ \overline {d}(A):\, A \subseteq \mathbb {R}^d \text{ is measurable and avoids } P_j,\ 1 \leq j \leq n \big \}$$
and
$$\mathbf {m}_{Q(0, R)}(P_1, \dots , P_n) := \sup \big \{ d_{Q(0, R)}(A):\, A \subseteq Q(0, R) \text{ is measurable and avoids } P_j,\ 1 \leq j \leq n \big \}.$$
These parameters are analogous to the notion of independence number of a hypergraph: if we consider the hypergraph on vertex set $\mathbb {R}^d$ (resp. $Q(0, R)$) whose edges are all isometric copies of $P_j$, $1\leq j\leq n$, then $\mathbf {m}_{\mathbb {R}^d}(P_1, \dots , P_n)$ (resp. $\mathbf {m}_{Q(0, R)}(P_1, \dots , P_n)$) can be thought of as the density of a largest independent set in this hypergraph.
Remark. For the sake of clarity and notational convenience, whenever possible the results we give about independence density will be stated and proved in the case of only one forbidden configuration. It can be easily verified that these results also hold in the case of several (but finitely many) forbidden configurations, with essentially unchanged proofs. Whenever we need this greater generality, we will mention how the corresponding statement reads in the case of several configurations.
We start our investigations by proving a simple lemma which relates the two versions of independence density just defined.
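Lemma 1. For all configurations $P \subset \mathbb {R}^d$ and all $R> 0$, we have
$$\mathbf {m}_{Q(0, R)}(P) \left ( \frac{R}{R + \textrm{diam}\, P} \right )^d \leq \mathbf {m}_{\mathbb {R}^d}(P) \leq \mathbf {m}_{Q(0, R)}(P).$$
Proof. For the first inequality, suppose $A \subseteq Q(0, R)$ is a set avoiding P and consider the periodic set $A' := A + (R + \textrm{diam}\, P) \mathbb {Z}^d$. This set also avoids P, and it has density
$$d(A') = \frac{\textrm{vol}(A)}{(R + \textrm{diam}\, P)^d} = d_{Q(0, R)}(A) \left ( \frac{R}{R + \textrm{diam}\, P} \right )^d.$$
Since we can choose $d_{Q(0, R)}(A)$ arbitrarily close to $\mathbf {m}_{Q(0, R)}(P)$, the leftmost inequality follows.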
Now, let $A \subseteq \mathbb {R}^d$ be any set avoiding P and note that $A \cap Q(x, R)$ also avoids P for every $x \in \mathbb {R}^d$ . By fixing $\varepsilon> 0$ and then averaging over all x inside a large enough cube $Q(0, R')$ (depending on A, $\textrm{diam}\, P$ and $\varepsilon $ ), we conclude there is $x \in \mathbb {R}^d$ for which $\textrm{vol}(A \cap Q(x, R))> (\overline {d}(A) - \varepsilon ) R^d$ . The rightmost inequality follows.
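The periodic construction in the first half of the proof can be quantified with a short computation. Each period cell is a cube of side $R + \textrm{diam}\, P$ containing exactly one translate of A, so the density of the periodic set equals the density of A inside $Q(0, R)$ multiplied by $(R/(R + \textrm{diam}\, P))^d$. The Python sketch below evaluates this factor; the function name `tiling_factor` and the sample values of R, d and the diameter are illustrative assumptions.

```python
def tiling_factor(R, diam, d):
    """Ratio between the density of A + (R + diam) Z^d and the density of A in Q(0, R)."""
    return (R / (R + diam)) ** d


# Illustrative values: dimension 2 and a configuration of diameter 1.
d, diam = 2, 1.0
for R in (1.0, 10.0, 100.0, 1000.0):
    # The loss caused by the empty margins between translates vanishes as R grows.
    print(R, tiling_factor(R, diam, d))
```

Since the factor tends to 1 as R grows, the two versions of the independence density agree in the limit of large cubes.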
As we are interested in the study of sets avoiding certain configurations, it is useful to also have a way of counting how many such configurations there are in a given set. For a given configuration $P = \{v_1, v_2, \dots , v_k\} \subset \mathbb {R}^d$ and a measurable set $A \subseteq \mathbb {R}^d$, we define
$$I_P(A) := \int _{\mathbb {R}^d} \int _{\textrm{O}(\mathbb {R}^d)} A(x + T v_1)\, A(x + T v_2) \cdots A(x + T v_k) \,d\mu (T) \,dx,$$
which represents how many (congruent) copies of P are contained in A. This quantity $I_P(A)$ can, of course, be infinite if the set A is unbounded, but we will use it almost exclusively for bounded sets. We can similarly define its weighted version
$$I_P(f) := \int _{\mathbb {R}^d} \int _{\textrm{O}(\mathbb {R}^d)} f(x + T v_1)\, f(x + T v_2) \cdots f(x + T v_k) \,d\mu (T) \,dx$$
whenever $f: \mathbb {R}^d \rightarrow \mathbb {R}$ is a measurable function for which this integral makes sense (say, for $f \in L^k(\mathbb {R}^d)$). A large part of our analysis consists of getting a better understanding of the counting function $I_P$.
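As a sanity check on the counting function, the following Python sketch estimates $I_P(A)$ by Monte Carlo in the simplest case of a two-point configuration in the plane. It assumes that $I_P(A)$ is the integral over $x \in \mathbb {R}^d$ of the Haar average over T of the product of the values $A(x + T v_i)$, and takes $P = \{0, (r, 0)\}$ and $A = [0, 1)^2$. The function names, the sample size and the seed are illustrative choices; the closed form used for comparison comes from averaging the overlap area $(1 - r|\cos \theta |)(1 - r|\sin \theta |)$ over the angle $\theta $, and is valid only for $0 \leq r \leq 1$.

```python
import random
from math import cos, sin, pi


def in_unit_square(p):
    """Indicator of A = [0, 1)^2."""
    return 0.0 <= p[0] < 1.0 and 0.0 <= p[1] < 1.0


def estimate_pair_count(r, samples=200_000, seed=1):
    """Monte Carlo estimate of I_P(A) for P = {0, (r, 0)} and A = [0, 1)^2."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        # Sample x uniformly in A: the integrand vanishes unless x lies in A.
        x = (rng.random(), rng.random())
        # For two points only the direction of T v matters, and it is uniform
        # on the circle when T follows the Haar measure on O(R^2).
        theta = rng.uniform(0.0, 2.0 * pi)
        y = (x[0] + r * cos(theta), x[1] + r * sin(theta))
        hits += in_unit_square(y)
    # vol(A) = 1, so the hit frequency needs no rescaling.
    return hits / samples


def exact_pair_count(r):
    """Closed form for 0 <= r <= 1: angular average of (1 - r|cos|)(1 - r|sin|)."""
    return 1.0 - 4.0 * r / pi + r * r / pi


print(estimate_pair_count(0.5), exact_pair_count(0.5))
```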
When a measurable set $A \subseteq \mathbb {R}^d$ avoids some configuration P, it is clear from the definition that $I_P(A) = 0$ ; however, it is also possible for $I_P(A)$ to be zero even when A contains congruent copies of P. In intuitive terms, the condition $I_P(A) = 0$ means only that A contains a negligible fraction of all possible copies of P. The next result shows that this distinction is essentially irrelevant for most purposes.
Lemma 2 (Zero-measure removal).
Suppose $P \subset \mathbb {R}^d$ is a finite configuration and $A \subseteq \mathbb {R}^d$ is measurable. If $I_P(A) = 0$ , then we can remove a zero-measure subset of A in order to remove all copies of P.
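Proof. By the Lebesgue Density Theorem, we have that
$$\lim _{\delta \rightarrow 0} d_{Q(x, \delta )}(A) = 1 \quad \text{for almost every } x \in A.$$
Now, we remove from A all points x for which this identity does not hold, thus obtaining a subset $B \subseteq A$ with $\textrm{vol}(A \setminus B) = 0$ and
$$\lim _{\delta \rightarrow 0} d_{Q(x, \delta )}(B) = 1 \quad \text{for all } x \in B.$$
We will show that no congruent copy of P remains on this restricted set B.
Suppose, for contradiction, that B contains a copy $\{u_1, \dots , u_k\}$ of P. By assumption, there exists some $\delta> 0$ such that
$$d_{Q(u_i, \delta )}(B) \geq 1 - \frac{1}{2^{d+1} k} \quad \text{for all } 1 \leq i \leq k; \qquad (1)$$
fix such a value of $\delta $. Note that, if $d_{Q(x, \delta )}(B) \geq 1 - 1/(2^{d+1} k)$ for some $x \in \mathbb {R}^d$, then for all $y \in Q(x, \delta /2)$, we have
$$d_{Q(y, \delta /2)}(B) \geq 1 - \frac{\textrm{vol}(Q(x, \delta ) \setminus B)}{\textrm{vol}(Q(y, \delta /2))} \geq 1 - \frac{2^d}{2^{d+1} k} = 1 - \frac{1}{2k},$$
since $Q(y, \delta /2) \subseteq Q(x, \delta )$. Our hypothesis (1) thus implies that $d_{Q(y, \delta /2)}(B) \geq 1 - 1/2k$ whenever $y \in Q(u_i, \delta /2)$ for some $1 \leq i \leq k$.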
Let $\ell := \max \{\|u_i\|:\, 1 \leq i \leq k\}$ be the largest length of a vector in our copy of P, and let us write $\mathcal {B}(I,\, \delta /(4\ell )) := \big \{ T \in \textrm{O}(\mathbb {R}^d):\, \|T - I\| \leq \delta /(4\ell ) \big \}$ for the ball of radius $\delta /(4\ell )$ in spectral norm centered on the identity I. Note that, whenever $T \in \mathcal {B}(I,\, \delta /(4\ell ))$ , we have that $T u_i \in Q(u_i, \delta /2)$ for each $1 \leq i \leq k$ . By the union bound, we then have
This immediately implies that
contradicting our assumption that $I_P(A) = 0$ and finishing the proof.
2.1 Fourier analysis on $\mathbb {R}^d$ and the Counting Lemma
We next show that the count of copies of an admissible configuration P inside a measurable set A does not significantly change if we ignore its fine details and ‘blur’ the set A a little. The philosophy is similar to the famous regularity method in graph theory, where a large graph can be replaced by a much smaller weighted ‘reduced graph’ (which is an averaged version of the original graph which ignores its fine details) without significantly changing the count of copies of any small subgraph.
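The methods we will use are Fourier analytic in nature, drawing from Bourgain’s arguments presented in [Reference Bourgain2]. We define the Fourier transform on $\mathbb {R}^d$ by
$$\widehat {f}(\xi ) := \int _{\mathbb {R}^d} f(x)\, e^{-2\pi i\, x \cdot \xi } \,dx, \qquad \widehat {\sigma }(\xi ) := \int _{\mathbb {R}^d} e^{-2\pi i\, x \cdot \xi } \,d\sigma (x)$$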
for a (complex-valued) function $f\in L^1(\mathbb {R}^d)$ and a finite Borel measure $\sigma $ on $\mathbb {R}^d$. The convolution between two functions f, $g \in L^1(\mathbb {R}^d)$ is defined by
$$f * g(x) := \int _{\mathbb {R}^d} f(y)\, g(x - y) \,dy.$$
We recall the basic identities $\widehat {f * g}(\xi ) = \widehat {f}(\xi ) \widehat {g}(\xi )$ and
as well as Parseval’s Identity $\|f\|_2 = \|\widehat {f}\|_2$ for $f\in L^1(\mathbb {R}^d) \cap L^2(\mathbb {R}^d)$ . For background in Fourier analysis, we refer the reader to the classic textbook of Stein and Weiss [Reference Stein and Weiss20].
Denote $\mathcal {Q}_{\delta }(x) := \delta ^{-d} Q(0, \delta )(x)$ . This way, $f * \mathcal {Q}_{\delta }(x) = \delta ^{-d} \int _{Q(x, \delta )} f(y) \,dy$ is the average of a function f on the cube $Q(x, \delta )$ . Specializing to the indicator function of a set $A \subseteq \mathbb {R}^d$ , we obtain $A * \mathcal {Q}_{\delta }(x) = d_{Q(x, \delta )}(A)$ ; this represents a ‘blurring’ of the set A considered (see Figure 1). What we wish to obtain is then an upper bound on the difference $|I_P(A) - I_P(A * \mathcal {Q}_{\delta })|$ which goes to zero as $\delta $ goes to zero uniformly over all measurable sets $A \subseteq Q(0, R)$ (for any fixed $R> 0$ ).
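To illustrate the blurring operation, the following Python sketch evaluates $A * \mathcal {Q}_{\delta }$ in the one-dimensional toy case where A is an interval, using the identity $A * \mathcal {Q}_{\delta }(x) = d_{Q(x, \delta )}(A)$ stated above. The restriction to $d = 1$, the choice $A = [0, 1)$ and the function name `blurred_interval` are assumptions made only for illustration.

```python
def blurred_interval(x, delta, a=0.0, b=1.0):
    """Value at x of A * Q_delta for A = [a, b) on the real line (d = 1)."""
    # Q(x, delta) is the window of length delta centred at x; the blurred value
    # is the fraction of that window covered by A.
    lo = max(a, x - delta / 2)
    hi = min(b, x + delta / 2)
    return max(0.0, hi - lo) / delta


delta = 0.2
for x in (-0.2, 0.0, 0.05, 0.5, 1.0, 1.3):
    print(x, blurred_interval(x, delta))
```

The blurred function agrees with the indicator of A except within distance $\delta /2$ of the endpoints of the interval, where it interpolates linearly between 0 and 1.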
Before delving into the details of our argument, let us present a simple telescoping sum argument which will be needed here and will be reused several times in this paper. Suppose we wish to bound from above the expression
for some given functions f, g and some configuration $P = \{v_1, \dots , v_k\}$ . Since we can rewrite the term inside the parenthesis above as the telescoping sum
it follows from the triangle inequality that $|I_P(f) - I_P(g)|$ is at most
To obtain some bound for $|I_P(f) - I_P(g)|,$ it then suffices to obtain a similar bound for an expression of the form
whenever each $h_i$ is either f or g, and whenever $(u_1, \dots , u_k)$ is a permutation of the points of P.
We shall refer to an argument of this form (breaking a difference of products into a telescoping sum, using the triangle inequality and bounding each term of the resulting expression) as the telescoping sum trick. It is frequently used in modern graph and hypergraph theory when estimating the number of subgraphs inside a given large (hyper)graph G with the aid of edge-discrepancy measures such as the cut norm; such results are usually known as counting lemmas and are an essential part of the regularity method we have already mentioned (see the surveys [Reference Rödl and Schacht18, Reference Castro-Silva4] for details).
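The pointwise identity behind the telescoping sum trick can be checked numerically. The Python sketch below verifies, for random values in $[-1, 1]$, that a difference of two k-fold products equals a sum of k terms each containing a single difference, and that the triangle inequality then bounds the difference of products by the sum of the individual differences. The ordering of factors in `telescoping_terms` is one possible choice and may differ from the ordering used in the paper; the names and the random inputs are illustrative.

```python
import random
from math import prod


def telescoping_terms(f, g):
    """Terms of a telescoping sum whose total is prod(f) - prod(g)."""
    k = len(f)
    # The j-th term uses g-values before position j, the difference at
    # position j, and f-values after position j.
    return [prod(g[:j]) * (f[j] - g[j]) * prod(f[j + 1:]) for j in range(k)]


random.seed(0)
k = 5
f = [random.uniform(-1, 1) for _ in range(k)]
g = [random.uniform(-1, 1) for _ in range(k)]

terms = telescoping_terms(f, g)
difference = prod(f) - prod(g)
# Each term is bounded by |f_j - g_j| because every other factor lies in [-1, 1].
bound = sum(abs(fj - gj) for fj, gj in zip(f, g))
print(difference, sum(terms), bound)
```

In the setting of the text, the values of f and g are replaced by the function values at the points $x + T v_i$, and the resulting terms are integrated over x and T.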
In our arguments, we will also need some analytic facts and estimates, which we now provide. Given an m-dimensional subspace $U\subseteq \mathbb {R}^d$ , we denote by $\sigma _U^{(m-1)}$ the uniform probability measure on its unit sphere $\mathbb {S}_U^{m-1} := \big \{x\in U:\, \|x\| = 1\big \}$ . This measure is closely related to the Haar measure $\mu _U$ on the orthogonal group $\textrm{O}(U)$ : if $X \subseteq \mathbb {S}_U^{m-1}$ is a measurable set and $x\in \mathbb {S}_U^{m-1}$ is any point, then
(see, for instance, [Reference Dai and Xu5, Appendix A.5] for a simple proof of this fact). Given $T\in \textrm{O}(\mathbb {R}^d)$ , we write $TU := \{Tu:\, u \in U\}$ for the rotated subspace.
Lemma 3. There are constants $C_1, C_2> 0$ (depending on the dimension d) such that
and, if V is an m-dimensional subspace of $\mathbb {R}^d$ ,
Proof. For the first inequality, note that
(where the j-th term in the product is $1$ if $\xi _j = 0$ ). It follows from the Taylor expansion of $\sin (\cdot )$ that $|x - \sin (x)| \leq C |x|^3$ for some $C>0$ and all $x\in [-1, 1]$ . As the sine function is bounded, we conclude there is some constant $C_1> 0$ (depending on d) for which
holds for all $\delta>0$ , $\xi \in \mathbb {R}^d$ .
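The Taylor estimate used in this step can be checked numerically. The Python sketch below tests the bound $|x - \sin (x)| \leq C |x|^3$ on a grid in $[-1, 1]$ with the specific constant $C = 1/6$, together with its consequence $1 - \sin (x)/x \leq x^2/6$ for a single sinc factor. The value $C = 1/6$ is an assumption made for the check (the text only asserts that some constant exists), and the grid and the names are illustrative.

```python
from math import sin


def sinc_defect(x):
    """Return 1 - sin(x)/x, with the value 0 at x = 0."""
    return 0.0 if x == 0 else 1.0 - sin(x) / x


# Grid of 2001 points covering the interval [-1, 1].
xs = [i / 1000 for i in range(-1000, 1001)]

# Largest violation of |x - sin x| <= |x|^3 / 6 on the grid (non-positive up to rounding).
worst_cubic = max(abs(x - sin(x)) - abs(x) ** 3 / 6 for x in xs)
# Largest violation of 1 - sin(x)/x <= x^2 / 6 on the grid.
worst_quadratic = max(sinc_defect(x) - x * x / 6 for x in xs)
print(worst_cubic, worst_quadratic)
```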
For the second inequality, we use the estimate
where $\pi _U \xi $ is the orthogonal projection of $\xi $ onto U and K is an absolute constant. This estimate follows from
and the well-known asymptotic bound $|\widehat {\sigma }_{\mathbb {R}^m}^{(m-1)}(\xi )| = O(\|\xi \|^{-(m-1)/2})$ for the unit sphere on $\mathbb {R}^m$ (see, for instance, Chapter VIII, Section 3 in Stein’s book [Reference Stein19]). For any $\xi \in \mathbb {R}^d\setminus \{0\}$ , we then have that
where we performed the change of variables $y = T^{-1} \xi /\|\xi \|$ .
It now suffices to show that the last integral above is finite, which we will do by induction on $d \geq m$ . In the base case where $d=m$ , the integral is clearly equal to $1$ since the projection operator $\pi _{\mathbb {R}^m}$ is the identity. If $d \geq m+1$ , parameterize $\mathbb {S}^{d-1}$ by
denoting by $\omega _{d-1}$ (resp. $\omega _{d-2}$ ) the total Lebesgue measure of the unit sphere of $\mathbb {R}^d$ (resp. $\mathbb {R}^{d-1}$ ), this change of variables gives
We then obtain
and the desired bound follows by induction.
We are now ready to formally state and prove our main technical tool in the Euclidean setting, which by analogy with methods from graph theory we shall call the Counting Lemma. We note that the main steps of its proof were already present in Bourgain’s paper [Reference Bourgain2].
Lemma 4 (Counting Lemma).
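For every admissible configuration $P \subset \mathbb {R}^d$, there exists a constant $C_P> 0$ such that the following holds: for every $R> 0$ and any measurable set $A \subseteq Q(0, R)$, we have that
$$|I_P(A) - I_P(A * \mathcal {Q}_{\delta })| \leq C_P\, \delta ^{1/4} R^d \quad \text{for all } \delta \in (0, 1].$$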
Moreover, the same constant $C_P$ can be made to hold uniformly over all configurations $P'$ inside a neighborhood of P.
Proof. Let $(v_1, \dots , v_k)$ be a fixed permutation of the points of P. We will work a bit more generally and show that a bound as in the statement of the lemma holds for
whenever $f_1,\, \dots ,\, f_k:\, Q(0, R) \rightarrow [-1, 1]$ are measurable functions. By our telescoping sum trick, this immediately implies the result.
We first exploit the translation invariance of the problem in order to simplify the argument later on. Let $U \subset \mathbb {R}^d$ denote the $(k-2)$ -dimensional affine hyperplane spanned by $v_1, \dots , v_{k-1}$ , and let $\pi _U v_k$ be the orthogonal projection of $v_k$ onto U (so $\pi _U v_k$ is the point in U which is closest to $v_k$ ). By translating all points in P by $-\pi _U v_k$ , we may assume that U contains the origin (being thus a subspace of $\mathbb {R}^d$ ) and that $v_k$ belongs to its orthogonal complement $U^{\perp }$ . Note that $v_k \neq 0$ since the points in P are affinely independent, and $U^{\perp }$ has dimension $d-k+2 \geq 2$ ; these are the two properties we will need in the proof which require the assumption that P is admissible.
Let $H := \textrm{Stab}^{\textrm{O}(\mathbb {R}^d)}(U)$ denote the subgroup of orthogonal transformations which act trivially on the subspace U and let $\nu _H$ be the Haar measure on H. Let $G := f_k - f_k * \mathcal {Q}_{\delta }$ and, for a given $T\in \textrm{O}(\mathbb {R}^d)$ , define the function $F_T:\, Q(0, R) \rightarrow [-1, 1]$ by
The integrand on the expression we wish to bound can then be written more succinctly as $F_T(x) G(x + T v_k)$ . By symmetry of the Haar measure $\mu $ , we conclude that
where we have used that $F_{TS} = F_T$ for all $S\in H$ , since by definition, $Sv_i = v_i$ for all $1\leq i\leq k-1$ . Using this identity, we conclude that the expression we wish to bound is at most
Now we concentrate on the expression inside the parenthesis in (2) for some fixed $T\in \textrm{O}(\mathbb {R}^d)$. We claim that, when S is distributed according to the Haar measure on H, the variable $y := TS(v_k/\|v_k\|)$ is uniformly distributed on the unit sphere of the subspace $TU^{\perp }$. This follows from the fact that $v_k/\|v_k\|$ is on the unit sphere of $U^{\perp }$, and $H := \textrm{Stab}^{\textrm{O}(\mathbb {R}^d)}(U)$ is isomorphic to the orthogonal group of $U^{\perp }$. Denoting by $\sigma ^{(d-k+1)}_{TU^{\perp }}$ the normalized surface measure on the unit sphere of $TU^{\perp }$, we can then write the expression inside the parenthesis in (2) as
Integrating over $T\in \textrm{O}(\mathbb {R}^d)$ and applying Cauchy-Schwarz to the inner integral, we conclude that (2) is at most
where we have used Parseval’s Identity and the convolution identity.
Since $|F_T(x)| \leq |f_1(x + Tv_1)|$ pointwise, it follows that $\|F_T\|_2 \leq \|f_1\|_2$ for all $T \in \textrm{O}(\mathbb {R}^d)$. Using this inequality and applying Cauchy-Schwarz to the outer integral, we see that the expression above is at most
Finally, the double integral in (3) can be bounded using the Fourier estimates given in Lemma 3, as we now show. Divide the integral over $\mathbb {R}^d$ into two parts, corresponding to the bounded region where $\|\xi \| \leq (\delta \|v_k\|)^{-1/2}$ and the unbounded region where $\|\xi \|> (\delta \|v_k\|)^{-1/2}$ . For the bounded region, we note that $|\widehat {\sigma }^{(d-k+1)}_{TU^{\perp }}(\xi )| \leq 1$ for all $T\in \textrm{O}(\mathbb {R}^d)$ , $\xi \in \mathbb {R}^d$ and use the first inequality in Lemma 3 to obtain
For the unbounded region, we use the simple estimate $|\widehat {\mathcal {Q}}_{\delta }(\xi )| \leq \|\mathcal {Q}_{\delta }\|_1 = 1$ and the second inequality in Lemma 3 to conclude that
Summing the bounds obtained for both regions shows that, for $d \geq k$ and $0 < \delta \leq 1$ , we can bound expression (3) by
and the inequality in the statement of the lemma follows. Since this last bound depends continuously on the positioning of the points of P (which gives the value of $\|v_k\|$ ), the claim that the obtained constant $C_P$ can be made uniform inside a neighborhood of P also follows.
We remark that the proof above is the only place (in the Euclidean setting) where we make explicit use of the assumption that a configuration is admissible. However, as the Counting Lemma will be an essential ingredient of several later results, this requirement will be inherited by them as well.
2.2 Continuity properties of the counting function
Given some configuration P on the space $\mathbb {R}^d$ , it is sometimes important to understand how much the count of congruent copies of P on a set $A \subseteq \mathbb {R}^d$ can change if we perturb the set A a little. An instance of this problem was already considered in the Counting Lemma, where the perturbation was given by blurring, and it was seen that the counting function $I_P$ is somewhat robust to small perturbations (in the case of admissible configurations).
Using our telescoping sum trick, it is easy to show that $I_P$ is also robust to small perturbations measured by the $L^{\infty }$ norm; more precisely, $I_P$ is continuous on $L^{\infty }(Q(0, R))$ for any fixed $R> 0$ . When P is admissible, we obtain the following significantly stronger continuity property.
Lemma 5 (Weak $^*$ continuity).
If $P \subset \mathbb {R}^{d}$ is an admissible configuration, then for every fixed $R> 0$ , the function $I_P$ is weak $^*$ continuous on the unit ball of $L^{\infty }(Q(0, R))$ .
Proof. Denote the closed unit ball of $L^{\infty }(Q(0, R))$ by $\mathcal {B}_{\infty }$ . Since $\mathcal {B}_{\infty }$ endowed with the weak $^*$ topology is metrizable (see e.g., [Reference Megginson15, Corollary 2.6.20]), it suffices to prove that $I_P$ is sequentially continuous (i.e., that $I_P(f_i) \xrightarrow {i \rightarrow \infty } I_P(f)$ whenever $f_i \xrightarrow {i \rightarrow \infty } f$ ).
Suppose then $(f_i)_{i \geq 1} \subset \mathcal {B}_{\infty }$ is a sequence weak $^*$ converging to $f \in \mathcal {B}_{\infty }$ . It follows that, for every $x \in Q(0, R)$ and every $\delta> 0$ , we have
Since $f * \mathcal {Q}_{\delta }$ and each $f_i * \mathcal {Q}_{\delta }$ are Lipschitz with the same constant (depending only on $\delta $ , as $\|f\|_{\infty }$ , $\|f_i\|_{\infty } \leq 1$ ) and $Q(0, R)$ is bounded, this easily implies that
In particular, it follows that $\lim _{i \rightarrow \infty } I_P(f_i * \mathcal {Q}_{\delta }) = I_P(f * \mathcal {Q}_{\delta })$ .
Since P is admissible, by the Counting Lemma (Lemma 4), we have
Choosing $i_0(\delta ) \geq 1$ sufficiently large so that
we conclude that
Since $\delta> 0$ is arbitrary, this implies that $\lim _{i \rightarrow \infty } I_P(f_i) = I_P(f)$ , as wished.
We will also need an equicontinuity property for the family of counting functions $P \mapsto I_P(A)$ over all bounded measurable sets $A \subseteq \mathbb {R}^d$. In what follows, we shall write $\mathcal {B}(P, r) \subset (\mathbb {R}^d)^k$ for the ball of radius r centered on $P = \{v_1, \dots , v_k\}$, where we recall that the distance from P to $Q = \{u_1, \dots , u_k\}$ is given by
$$\|Q - P\|_{\infty } := \min _{\sigma \in \mathfrak {S}_k} \, \max _{1 \leq i \leq k} \|u_i - v_{\sigma (i)}\|.$$
Lemma 6 (Equicontinuity).
For every admissible $P \subset \mathbb {R}^d$ and every $\varepsilon> 0$ , there is $\delta> 0$ such that the following holds: if $P' \in \mathcal {B}(P, \delta )$ , then for all $R \geq 1$ , we have
Proof. We will use the fact that the constant $C_P$ promised in the Counting Lemma can be made uniform inside a small neighborhood of P; more precisely, there is $r> 0$ and a constant $\tilde {C}_P> 0$ such that
holds for all $P' \in \mathcal {B}(P, r)$ , $R> 0$ and (measurable) $A \subseteq Q(0, R)$ .
Fix constants $R \geq 1$ and $\delta , \rho \in (0, 1]$ with $\delta < \rho $ . For any set $A \subseteq Q(0, R)$ and any points $x,\, y \in Q(0, R)$ with $\|x - y\| \leq \delta $ , we have that
Noting that $A * \mathcal {Q}_{\rho }$ is supported on $Q(0, R+\rho ) \subseteq Q(0, 2R)$ , we conclude from our telescoping sum trick that
whenever $\|P' - P\|_{\infty } \leq \delta $ .
Now take $\rho \in (0, 1]$ small enough so that $\tilde {C}_P \rho ^{1/4} \leq \varepsilon /4$ , and for this value of $\rho $ , take $0 < \delta < \min \{r,\, \rho \}$ small enough so that
Then, for any configuration $P' \in \mathcal {B}(P,\, \delta )$ and any set $A \subseteq Q(0, R)$ , we obtain
as desired.
2.3 The Supersaturation Theorem
Now we wish to show that geometrical hypergraphs encoding copies of some admissible configuration P have a nice supersaturation property: if a set $A \subseteq \mathbb {R}^d$ is just slightly denser than the independence density of P, then it must contain a positive proportion of all congruent copies of P. This result is quite similar, both formally and in spirit, to an important combinatorial theorem of Erdős and Simonovits [Reference Erdős and Simonovits10] in the setting of forbidden graphs and hypergraphs.
Remark. The insight that supersaturation results can be used to better study extremal geometrical problems of the kind we are interested in is due to Bukh [Reference Bukh3]. He introduced the notion of a ‘supersaturable property’ as any characteristic of measurable sets which satisfies several conditions meant to enable the proof of a supersaturation result; the prototypical and most important example of supersaturable property given in Bukh’s paper is that of avoiding a finite collection of distances. Here, we will obtain similar results in the case of avoiding general admissible configurations, but our method of proof is more analytical in nature and quite different from his.
Using our zero-measure removal lemma (Lemma 2), we can immediately obtain a weak supersaturation property which holds for any $R> 0$ and any configuration $P \subset \mathbb {R}^d$ :
(WS) If $d_{Q(0, R)}(A)> \mathbf {m}_{Q(0, R)}(P)$ , then $I_P(A)> 0$ .
For our purposes, however, we will need to strengthen this simple property in two ways: first, to obtain a uniform lower bound on $I_P(A)$ which depends only on R and the slack $d_{Q(0, R)}(A) - \mathbf {m}_{Q(0, R)}(P)$ , but not on the specific set $A \subseteq Q(0, R)$ ; and second, to make the proportion $I_P(A \cap Q(0, R))/R^d$ of copies of P uniform also in the size R of the cube considered.
The first strengthening can be obtained from (WS) by a compactness argument, using the fact that the counting function of admissible configurations is weak $^*$ continuous.
Lemma 7 (Weak supersaturation).
Let $P \subset \mathbb {R}^d$ be an admissible configuration. For every $R> 0$ and $\varepsilon> 0$ , there exists $c_0> 0$ such that the following holds: whenever $A \subseteq Q(0, R)$ satisfies $d_{Q(0, R)}(A) \geq \mathbf {m}_{Q(0, R)}(P) + \varepsilon $ , we have $I_P(A) \geq c_0$ .
Proof. Suppose, for contradiction, that the result is false. Then, there exist $\varepsilon> 0$ , $R> 0$ and a sequence $(A_i)_{i \geq 1}$ of subsets of $Q(0, R)$ , each of density at least $\mathbf {m}_{Q(0, R)}(P) + \varepsilon $ , which satisfy $\lim _{i \rightarrow \infty } I_P(A_i) = 0$ .
The unit ball $\mathcal {B}_{\infty }$ of $L^{\infty }(Q(0, R))$ is weak $^*$ compact by the Banach-Alaoglu Theorem, and it is also metrizable in this topology (see [Reference Megginson15, Chapter 2.6]). By possibly restricting to a subsequence, we may then assume that $(A_i)_{i \geq 1}$ converges in the weak $^*$ topology of $L^{\infty }(Q(0, R))$ ; let us denote its limit by $A \in \mathcal {B}_{\infty }$ . It is clear that $0 \leq A \leq 1$ almost everywhere, and
By weak $^*$ continuity of $I_P$ (Lemma 5), we also have $I_P(A) = \lim _{i \rightarrow \infty } I_P(A_i) = 0$ .
Now, let $B := \big \{x \in Q(0, R):\, A(x) \geq \varepsilon \big \}$ . Since
we conclude that $I_P(B) \leq \varepsilon ^{-k} I_P(A) = 0$ and
But this set B contradicts (WS) (or Lemma 2), finishing the proof.
Our desired supersaturation result now follows from a simple averaging argument.
Theorem 3 (Supersaturation Theorem).
Let $P \subset \mathbb {R}^d$ be an admissible configuration and let $\varepsilon> 0$ . There exist constants $c> 0$ and $R_0> 0$ such that the following holds for all $R \geq R_0$ : if $A \subseteq Q(0, R)$ satisfies
$$d_{Q(0, R)}(A) \geq \mathbf {m}_{Q(0, R)}(P) + \varepsilon ,$$
then $I_P(A) \geq c R^d$ .
Proof. Take $R_1> 0$ large enough so that $\mathbf {m}_{Q(0, R_1)}(P) \leq \mathbf {m}_{\mathbb {R}^d}(P) + \varepsilon /4$ (see Lemma 1). We will show that the conclusion of the theorem holds for $R_0 = 4 d R_1/\varepsilon $ and some constant $c> 0$ to be chosen later.
Suppose $R \geq 4 d R_1/\varepsilon $ , and let $A \subseteq Q(0, R)$ be a set of density
Since $\mathbf {m}_{Q(0, R_1)}(P) \leq \mathbf {m}_{\mathbb {R}^d}(P) + \varepsilon /4 \leq \mathbf {m}_{Q(0, R)}(P) + \varepsilon /4,$ we have that
Let $K := \lfloor R/R_1 \rfloor $ , and note that
By our assumption on R, we conclude that $K^d R_1^d \geq (1 - \varepsilon /4) R^d$ , and thus
Partitioning the cube $Q(0, K R_1)$ into $K^d$ cubes of side length $R_1$ , by averaging, we conclude that at least $\varepsilon K^d/4$ of these cubes $Q(x, R_1)$ satisfy
By Lemma 7, there is some $c_0> 0$ (depending on $R_1$ and $\varepsilon $ but not on A or R) such that $I_P(A \cap Q(x, R_1)) \geq c_0$ holds for each one of the cubes in the partition satisfying (4); summing up all these values, we conclude that
finishing the proof for $c = \varepsilon c_0/(2^{d+2} R_1^d)$ .
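The arithmetic behind the choice of $R_0$ and $c$ can be spelled out as follows. This is a sketch reconstructed from the constants stated in the proof; it uses only Bernoulli's inequality, the bound $\lfloor x \rfloor \geq x/2$ for $x \geq 1$ (applicable since the density hypothesis forces $\varepsilon \leq 1$, hence $R \geq R_1$), and the fact that copies of P counted inside disjoint cubes are distinct. How the estimates are chained is an assumption and need not match the paper's own displays.

```latex
\begin{align*}
K^d R_1^d &\geq (R - R_1)^d
   = R^d \Big(1 - \frac{R_1}{R}\Big)^d
   \geq R^d \Big(1 - \frac{d R_1}{R}\Big)
   \geq \Big(1 - \frac{\varepsilon}{4}\Big) R^d
   && \text{since } R \geq 4 d R_1/\varepsilon, \\
I_P(A) &\geq \sum_{\text{cubes satisfying (4)}} I_P\big(A \cap Q(x, R_1)\big)
   \geq \frac{\varepsilon K^d}{4}\, c_0
   \geq \frac{\varepsilon c_0}{4} \Big(\frac{R}{2 R_1}\Big)^d
   = \frac{\varepsilon c_0}{2^{d+2} R_1^d}\, R^d
   && \text{since } K \geq \frac{R}{2 R_1}.
\end{align*}
```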
Remark. The arguments used in the proofs of Lemma 7 and Theorem 3 easily extend to the case of several configurations $P_1, \dots , P_n \subset \mathbb {R}^d$ , with only minor notational modifications. In the case of the Supersaturation Theorem, one concludes that $I_{P_i}(A) \geq c(\varepsilon ) R^d$ holds for some $1 \leq i \leq n$ whenever the density condition $d_{Q(0, R)}(A) \geq \mathbf {m}_{Q(0, R)}(P_1,\, \dots ,\, P_n) + \varepsilon $ is satisfied (assuming R is large enough and all the configurations $P_i$ are admissible).
Following Bukh [Reference Bukh3], for each $\delta> 0$ and $\gamma> 0$ , we define the zooming-out operator $\mathcal {Z}_{\delta }(\gamma )$ as the map which takes a measurable set $A \subseteq \mathbb {R}^d$ to the set
Intuitively, $\mathcal {Z}_{\delta }(\gamma )[A]$ represents the points where A is not too sparse at scale $\delta $ .
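To make the zooming-out operator concrete, here is a minimal one-dimensional sketch on a discretized line. It assumes the operator is $\mathcal {Z}_{\delta }(\gamma )[A] = \{x:\, d_{Q(x, \delta )}(A) \geq \gamma \}$ , which is consistent with the way it is used in the proof of Corollary 1. The function name `zoom_out`, the grid representation and the periodic boundary handling are illustrative assumptions and not part of the paper.

```python
def zoom_out(indicator, window, gamma):
    """Discrete 1-D analogue of the zooming-out operator (illustrative only).

    indicator: list of 0/1 values, the set A sampled on a uniform grid.
    window:    odd number of grid cells, playing the role of the scale delta.
    gamma:     density threshold in (0, 1].
    Returns a 0/1 list marking the cells x where the density of A inside
    the window centred at x is at least gamma. Indices wrap around; this
    periodic boundary is an assumption made for simplicity.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    n = len(indicator)
    half = window // 2
    result = []
    for x in range(n):
        # Number of points of A in the window centred at x.
        count = sum(indicator[(x + off) % n] for off in range(-half, half + 1))
        result.append(1 if count >= gamma * window else 0)
    return result


# A is dense on the first half of the grid (with one hole) and has a
# single isolated point in the second half.
A = [1, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0]
Z = zoom_out(A, window=3, gamma=0.5)
```

In this toy example, the hole at index 2 is filled in and the isolated point at index 9 is discarded, which matches the intuition that the operator keeps the places where A is not too sparse at the chosen scale.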
Using the Supersaturation Theorem together with the Counting Lemma, we can now show that the existence of copies of P in a set A follows also from the weaker assumption that its zoomed-out version $\mathcal {Z}_{\delta }(\gamma )[A]$ has density higher than $\mathbf {m}_{\mathbb {R}^d}(P)$ (rather than A itself having this same density). This property will be important for us later on.
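Corollary 1. Given an admissible configuration $P \subset \mathbb {R}^d$ and $\varepsilon> 0$ , there exists $\delta _0> 0$ such that the following holds for all $\delta \leq \delta _0$ : if $A \subseteq \mathbb {R}^d$ satisfies
$$\overline {d} \big (\mathcal {Z}_{\delta }(\varepsilon )[A]\big ) \geq \mathbf {m}_{\mathbb {R}^d}(P) + \varepsilon ,$$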
then A contains a congruent copy of P.
Proof. Let $R_0$ , $c> 0$ be the constants promised in the Supersaturation Theorem applied to P and with $\varepsilon $ replaced by $\varepsilon /3$ . After replacing $R_0$ with a larger constant if necessary, we may also assume that $\mathbf {m}_{Q(0, R)}(P) \leq \mathbf {m}_{\mathbb {R}^d}(P) + \varepsilon /3$ for all $R \geq R_0$ (see Lemma 1).
Suppose $A \subseteq \mathbb {R}^d$ satisfies $\overline {d} \big (\mathcal {Z}_{\delta }(\varepsilon )[A]\big ) \geq \mathbf {m}_{\mathbb {R}^d}(P) + \varepsilon $ for some $0 < \delta \leq 1$ . Since
there must exist some $R \geq R_0$ such that
Denoting $A' := A\cap Q(0,R)$ , we may then assume that $A' \subseteq Q(0, R)$ satisfies
for some $R \geq R_0$ , and wish to show that $A'$ (and hence A) contains a copy of P if $\delta> 0$ is small enough depending on P and $\varepsilon $ .
By the Supersaturation Theorem, inequality (5) implies that $I_P \big (\mathcal {Z}_{\delta }(\varepsilon )[A']\big ) \geq c R^d$ . Since $A' * \mathcal {Q}_{\delta }(x) \geq \varepsilon \cdot \mathcal {Z}_{\delta }(\varepsilon )[A'](x)$ for all $x \in \mathbb {R}^d$ , we obtain from the Counting Lemma that
Taking $\delta> 0$ small enough for this last expression to be positive, we conclude that $I_P(A')> 0$ , and so $A'$ contains a copy of P as wished.
2.4 Results on the independence density
We are finally in a position to properly study the independence density parameter for a family of configurations in Euclidean space.
We start by proving a simple lower bound on the independence density of several distinct configurations; this result and the argument we use to prove it are originally due to Bukh [Reference Bukh3].
Lemma 8 (Supermultiplicativity).
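For all $n \geq 1$ and all configurations $P_1, \dots , P_n \subset \mathbb {R}^d$ , we have that
$$\mathbf {m}_{\mathbb {R}^d}(P_1,\, P_2,\, \dots ,\, P_n) \geq \prod _{i=1}^n \mathbf {m}_{\mathbb {R}^d}(P_i).$$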
Proof. Fix $\varepsilon> 0$ and choose R large enough so that
For each $1 \leq i \leq n$ , let $A_i \subseteq Q(0, R - \textrm{diam}\, P_i)$ be a set which avoids $P_i$ and satisfies $d_{Q(0, R - \textrm{diam}\, P_i)}(A_i)> \mathbf {m}_{\mathbb {R}^d}(P_i) - \varepsilon $ (this is possible by Lemma 1). We then construct the R-periodic set $A_i' := A_i + R \mathbb {Z}^d$ , which also avoids $P_i$ and has density
Since each set $A_i'$ is periodic with the same fundamental domain $Q(0, R)$ , it follows that the average of $d\big (\bigcap _{i=1}^n (x_i + A_i')\big )$ over independent translates $x_1, \dots , x_n \in Q(0, R)$ is equal to $\prod _{i=1}^n d(A_i')$ . There must then exist some $x_1, \dots , x_n \in Q(0, R)$ for which
Since $\bigcap _{i=1}^n (x_i + A_i')$ avoids each of the configurations $P_i$ and $\varepsilon> 0$ was arbitrary, the desired lower bound follows.
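The averaging step in this proof has an exact finite analogue, sketched here under the assumption that the torus $\mathbb {R}^d / R \mathbb {Z}^d$ is replaced by the cyclic group $\mathbb {Z}_n$ and the periodic sets $A_i'$ by subsets of $\mathbb {Z}_n$ . All names below are hypothetical; the code only checks that the average density of the intersection over independent shifts equals the product of the densities.

```python
from fractions import Fraction
from itertools import product


def density(s, n):
    """Density of a subset s of the cyclic group Z_n."""
    return Fraction(len(s), n)


def shift(s, x, n):
    """Translate the subset s of Z_n by x."""
    return {(a + x) % n for a in s}


def average_intersection_density(sets, n):
    """Average density of the intersection of (x_i + A_i) over all
    independent shifts (x_1, ..., x_m) in Z_n^m."""
    total = Fraction(0)
    for shifts in product(range(n), repeat=len(sets)):
        common = set(range(n))
        for s, x in zip(sets, shifts):
            common &= shift(s, x, n)
        total += density(common, n)
    return total / n ** len(sets)


n = 6
A1, A2 = {0, 1, 3}, {2, 5}
avg = average_intersection_density([A1, A2], n)
prod_density = density(A1, n) * density(A2, n)
```

Since the average equals the product of the densities, at least one choice of shifts must give an intersection of density at least this product, which is the pigeonhole step used in the proof.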
Intuitively, one may regard $\mathbf {m}_{\mathbb {R}^d}(P_1,\, P_2,\, \dots ,\, P_n)$ being close to $\prod _{i=1}^n \mathbf {m}_{\mathbb {R}^d}(P_i)$ as some sort of independence or lack of correlation between the n constraints of forbidding each configuration $P_i$ ; in this case, there is no better way to choose a set avoiding all of these configurations than simply intersecting optimal $P_i$ -avoiding sets for each i (after suitably translating them). One might then expect this to happen if the sizes of the configurations $P_i$ are very different from one another, so that each constraint is relevant at a different and largely independent scale.
Our next result shows this is indeed the case whenever the configurations considered are all admissible. (A theorem of Graham [Reference Graham13] implies this is not necessarily true if the configurations considered are non-admissible; see Section 4 for a discussion.) The proof we present here is based on Bukh’s arguments for supersaturable properties and generalizes his result from two-point configurations to general admissible configurations.
Theorem 4 (Asymptotic independence).
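If $P_1, P_2, \dots , P_n \subset \mathbb {R}^d$ are admissible configurations, then
$$\mathbf {m}_{\mathbb {R}^d}(t_1 P_1,\, t_2 P_2,\, \dots ,\, t_n P_n) \rightarrow \prod _{i=1}^n \mathbf {m}_{\mathbb {R}^d}(P_i)$$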
as the ratios $t_2/t_1,\, t_3/t_2,\, \dots ,\, t_n/t_{n-1}$ tend to infinity.
Proof. We have already seen that
always holds, so it suffices to show that $\mathbf {m}_{\mathbb {R}^d}(t_1 P_1,\, t_2 P_2,\, \dots ,\, t_n P_n)$ is no larger than $\prod _{i=1}^n \mathbf {m}_{\mathbb {R}^d}(P_i) + \varepsilon $ whenever $\varepsilon> 0$ and the ratios between consecutive scales $t_i$ are large enough. We shall proceed by induction, with the case $n = 1$ being trivial.
Let $n \geq 2$ and suppose the theorem holds for configurations $P_1, \dots , P_{n-1}$ . Fix $0 < \varepsilon \leq 1$ and let $t_1,\, \dots ,\, t_{n-1}> 0$ be dilation parameters for which
now, take $R_0> 0$ large enough so that
holds for all $R \geq R_0$ (this quantity exists by Lemma 1).
If $A \subseteq \mathbb {R}^d$ is a measurable set avoiding $t_1 P_1,\, \dots ,\, t_{n-1} P_{n-1}$ , then clearly
Moreover, if A also avoids $t_n P_n$ for some $t_n> 0$ , then $A/t_n$ avoids $P_n$ and so by Corollary 1, there is some $\delta _0> 0$ (depending only on $P_n$ and $\varepsilon $ ) for which
Suppose now that $t_n \geq R_0/\delta _0$ and let $A \subseteq \mathbb {R}^d$ be any measurable set avoiding $t_1 P_1,\, \dots ,\, t_n P_n$ . We conclude from (6) that
holds for all $x \in \mathbb {R}^d$ , and from (7), we have
This means that the density of A inside cubes $Q(x, t_n \delta _0)$ of side length $t_n \delta _0$ is at most $\varepsilon $ (when $x \notin \mathcal {Z}_{t_n \delta _0}(\varepsilon )[A]$ ) except at a set of upper density at most $\mathbf {m}_{\mathbb {R}^d}(P_n) + \varepsilon $ , when it is instead no more than $\prod _{i=1}^{n-1} \mathbf {m}_{\mathbb {R}^d}(P_i) + 2\varepsilon $ . Taking averages, we conclude that
(where we used that $0 < \varepsilon \leq 1$ ). This inequality finishes the proof.
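One way to carry out the final averaging is sketched below. Here $M := \prod _{i=1}^{n-1} \mathbf {m}_{\mathbb {R}^d}(P_i)$ and $Z := \mathcal {Z}_{t_n \delta _0}(\varepsilon )[A]$ are shorthand introduced only for this sketch, and the constant 6 is simply what this bookkeeping yields; it is not claimed to match the constant in the paper's own display. The last step uses $M \leq 1$ , $\mathbf {m}_{\mathbb {R}^d}(P_n) \leq 1$ and $\varepsilon ^2 \leq \varepsilon $ .

```latex
\begin{align*}
\overline{d}(A)
  &\leq \varepsilon + \overline{d}(Z)\,(M + 2\varepsilon) \\
  &\leq \varepsilon + \big(\mathbf{m}_{\mathbb{R}^d}(P_n) + \varepsilon\big)(M + 2\varepsilon) \\
  &= \mathbf{m}_{\mathbb{R}^d}(P_n)\, M
     + 2\varepsilon\, \mathbf{m}_{\mathbb{R}^d}(P_n)
     + \varepsilon M + 2\varepsilon^2 + \varepsilon \\
  &\leq \prod_{i=1}^{n} \mathbf{m}_{\mathbb{R}^d}(P_i) + 6\varepsilon .
\end{align*}
```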
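As an immediate corollary of the last theorem, we conclude that
$$\mathbf {m}_{\mathbb {R}^d}(t_1 P,\, t_2 P,\, \dots ,\, t_n P) \rightarrow \mathbf {m}_{\mathbb {R}^d}(P)^n$$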
as $t_2/t_1,\, t_3/t_2,\, \dots ,\, t_n/t_{n-1} \rightarrow \infty $ whenever $P \subset \mathbb {R}^d$ is admissible; this answers our question (Q1) in the case of admissible configurations. Let us now show how this result easily implies Bourgain’s Theorem given in the Introduction.
Proof of Theorem 2.
Suppose $A \subset \mathbb {R}^d$ is a measurable set not satisfying the conclusion of the theorem; thus, there is a sequence $(t_j)_{j \geq 1}$ tending to infinity such that A does not contain a copy of any $t_j P$ . This implies that $\overline {d}(A) \leq \mathbf {m}_{\mathbb {R}^d}(t_1 P,\, t_2 P, \dots ,\, t_n P)$ for all $n \in \mathbb {N}$ . By taking a suitably fast-growing subsequence, we may then use Theorem 4 to obtain (say) $\overline {d}(A) \leq 2 \mathbf {m}_{\mathbb {R}^d}(P)^n$ for any fixed $n \geq 1$ . Since $\mathbf {m}_{\mathbb {R}^d}(P) < 1$ ,Footnote 6 this implies that $\overline {d}(A) = 0$ , as wished.
Going back to our study of the independence density for multiple configurations, we will now consider the opposite situation of what we have seen before: when the constraints of forbidding each individual configuration are so strongly correlated as to be essentially redundant. One might expect this to be the case, for instance, when we are forbidding very close dilates of a given configuration P.
We will show that this intuition is indeed correct, whether or not the configuration considered is admissible, and the proof is much simpler than in the case of very distant dilates of P (in particular, not needing the results from earlier sections).
Lemma 9 (Asymptotic redundancy).
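For any configuration $P \subset \mathbb {R}^d$ , we have that
$$\mathbf {m}_{\mathbb {R}^d}(t_1 P,\, t_2 P,\, \dots ,\, t_n P) \rightarrow \mathbf {m}_{\mathbb {R}^d}(P)$$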
as $t_2/t_1,\, t_3/t_2,\, \dots ,\, t_n/t_{n-1} \rightarrow 1$ .
Proof. Assume, by dilation invariance, that $t_1 = 1$ , and note that it suffices to show the convergence above with $\mathbf {m}_{\mathbb {R}^d}$ replaced by $\mathbf {m}_{Q(0, R)}$ for every fixed $R> 0$ . We will then fix an arbitrary $R> 0$ and prove that $\mathbf {m}_{Q(0, R)}(P,\, t_2 P,\, \dots ,\, t_n P) \rightarrow \mathbf {m}_{Q(0, R)}(P)$ as $t_2,\, t_3,\, \dots ,\, t_n \rightarrow 1$ .
Let $(v_1, v_2, \dots , v_k)$ be an ordering of the points of P, and consider the continuous function $g_P:\, (\mathbb {R}^d)^k \times \textrm{O}(\mathbb {R}^d) \rightarrow \mathbb {R}$ given by
Note that $\min _{T \in \textrm{O}(\mathbb {R}^d)} g_P(x_1,\, \dots ,\, x_k,\, T) = 0$ if and only if $(x_1, \dots , x_k)$ is congruent to $(v_1, \dots , v_k)$ .
Fix some $\varepsilon> 0$ and let $A \subset Q(0, R)$ be a measurable set which avoids P and has density $d_{Q(0, R)}(A) \geq \mathbf {m}_{Q(0, R)}(P) - \varepsilon $ . By inner regularity, we know there exists a compact set $\widetilde {A} \subseteq A$ with $d_{Q(0, R)}(\widetilde {A}) \geq \mathbf {m}_{Q(0, R)}(P) - 2\varepsilon $ . Denote by $\gamma $ the minimum of the continuous function $g_P$ on the compact set $\widetilde {A}^k \times \textrm{O}(\mathbb {R}^d)$ ; since $\widetilde {A}$ avoids P, it follows that $\gamma> 0$ .
We will now prove that $\widetilde {A}$ also avoids $t P$ whenever t is sufficiently close to 1, say when $|t-1| < \gamma /(k \cdot \textrm{diam}\, P)$ . Indeed, for all $x_1, \dots , x_k \in \widetilde {A}$ and all $T \in \textrm{O}(\mathbb {R}^d)$ , by the triangle inequality, we have that
which is positive if $|t-1| < \gamma /(k \cdot \textrm{diam}\, P)$ . In particular, we see that
whenever $|t_j - 1| < \gamma /(k \cdot \textrm{diam}\, P)$ for $2 \leq j \leq n$ . Since we clearly have that $\mathbf {m}_{Q(0, R)}(P,\, t_2 P,\, \dots ,\, t_n P) \leq \mathbf {m}_{Q(0, R)}(P)$ , the result follows.
The proof of this last result actually implies a somewhat stronger and more technical property of the independence density, namely that every configuration P where $\mathbf {m}_{\mathbb {R}^d}$ is discontinuous must be a local minimum across the ‘discontinuity barrier’; more formally, we have that $\inf _{P'\in \mathcal {B}(P, \delta )} \mathbf {m}_{\mathbb {R}^d}(P') \rightarrow \mathbf {m}_{\mathbb {R}^d}(P)$ as $\delta \rightarrow 0$ , where $\mathcal {B}(P, \delta )$ is the ball of radius $\delta $ centered on P (the details of the proof are given below). If the configuration P is admissible, then we can also prove the corresponding limit for $\sup _{P'\in \mathcal {B}(P, \delta )} \mathbf {m}_{\mathbb {R}^d}(P')$ and conclude that $\mathbf {m}_{\mathbb {R}^d}$ is, in fact, continuous at this point. This is done in the next theorem:
Theorem 5 (Continuity of the independence density).
For every $n \geq 1$ , the function $(P_1, \dots , P_n) \mapsto \mathbf {m}_{\mathbb {R}^d}(P_1, \dots , P_n)$ is continuous on the set of n-tuples of admissible configurations in $\mathbb {R}^d$ .
Proof. For the sake of better readability, we will prove the result in the case of only one forbidden configuration; the n-variable version easily follows from the same argument. Fix some $\varepsilon> 0$ and let $R_1 \geq 1$ be large enough so that
holds for all $P' \in \mathcal {B}(P, 1)$ and all $R \geq R_1$ (this value exists by Lemma 1).
Let $R \geq R_1$ and let $A \subset Q(0, R)$ be a compact P-avoiding set with density
Proceeding exactly as we did in the proof of the last lemma, we conclude that A also avoids all $P'$ close enough to P; for all such configurations, we then have
Since $\mathbf {m}_{Q(0, R)}(P') \leq \mathbf {m}_{\mathbb {R}^d}(P') + \varepsilon $ whenever $P' \in \mathcal {B}(P, 1)$ , this implies that $\mathbf {m}_{\mathbb {R}^d}(P') \geq \mathbf {m}_{\mathbb {R}^d}(P) - 2\varepsilon $ for all $P'$ close enough to P.
Now, we suppose that P is admissible, and let $R_0$ , $c> 0$ be the constants promised by the Supersaturation Theorem (Theorem 3). Let $R \geq \max \{R_0, R_1\}$ . By equicontinuity (Lemma 6), there is some $\delta> 0$ for which the inequality
holds whenever $A \subset Q(0, R)$ is a measurable set; fix such a value of $\delta $ .
If $P' \in \mathcal {B}(P, \delta )$ and $A \subset Q(0, R)$ is a measurable set avoiding $P'$ , we conclude from inequality (8) that $I_P(A) < c R^d$ . By the Supersaturation Theorem, this implies that $d_{Q(0, R)}(A) < \mathbf {m}_{Q(0, R)}(P) + \varepsilon $ , and thus (by optimizing over A), we conclude that $\mathbf {m}_{Q(0, R)}(P') \leq \mathbf {m}_{Q(0, R)}(P) + \varepsilon $ . It follows that
whenever $P' \in \mathcal {B}(P, \delta )$ , finishing the proof.
These last results can now be combined in a very simple way to give an (almost complete) answer to question (Q2), when restricted to admissible configurations. Let us denote by $\mathcal {M}_n(P)$ the set of all possible independence densities one can obtain by forbidding n distinct dilates of a configuration P; that is,
Recall that (Q2) asked for an explicit description of this set $\mathcal {M}_n(P)$ .
Theorem 6 (Forbidding multiple dilates).
If $P \subset \mathbb {R}^d$ is admissible, then
Proof. It is clear that $\mathbf {m}_{\mathbb {R}^d}(t_1 P,\, t_2 P,\, \dots ,\, t_n P) \leq \mathbf {m}_{\mathbb {R}^d}(t_1 P) = \mathbf {m}_{\mathbb {R}^d}(P)$ always holds, and we saw in Lemma 8 that
Moreover, Lemma 9 implies that $\mathbf {m}_{\mathbb {R}^d}(P)$ is an accumulation point of the set $\mathcal {M}_n(P)$ , and (since P is admissible) Theorem 4 implies the same about $\mathbf {m}_{\mathbb {R}^d}(P)^n$ . The result follows from continuity of the function
which is an immediate consequence of Theorem 5.
As our final result in the Euclidean setting, we will show the existence of extremizer sets which avoid admissible configurations. This generalizes a result of Bukh (see Corollary 13 in [Reference Bukh3]) from forbidden distances to higher-order configurations.
Theorem 7 (Existence of extremizers).
If $P \subset \mathbb {R}^{d}$ is admissible, then there exists a P-avoiding measurable set $A \subseteq \mathbb {R}^d$ with density $d(A) = \mathbf {m}_{\mathbb {R}^d}(P)$ .
Proof. For each integer $i \geq 1$ , let $A_i \subseteq Q(0, i)$ be a P-avoiding set with density $d_{Q(0, i)}(A_i) \geq \mathbf {m}_{Q(0, i)}(P) - 2^{-i}$ . Denote the unit ball of $L^{\infty }(\mathbb {R}^d)$ by $\mathcal {B}_{\infty }$ ; by the Banach-Alaoglu Theorem, $\mathcal {B}_{\infty }$ is weak $^*$ compact. By restricting to a subsequence if necessary, we may then assume that $(A_i)_{i \geq 1}$ converges to some element $\widetilde {A} \in \mathcal {B}_{\infty }$ in the weak $^*$ topology of $L^{\infty }(\mathbb {R}^d)$ . Denote by $A := \textrm{supp}\, \widetilde {A}$ the support of $\widetilde {A}$ .Footnote 7
We will first prove that $I_P(A) = 0$ . Fix some $R> 0$ and (for notational convenience) denote the indicator function of the cube $Q(0, R)$ by $\chi _R$ . Writing $\chi _R A_i$ for the pointwise product of $\chi _R$ and the indicator function of $A_i$ , one easily sees from the definition that $(\chi _R A_i)_{i \geq 1}$ converges to $\chi _R \widetilde {A}$ in the weak $^*$ topology of $L^{\infty }(Q(0, R))$ . As P is admissible, the counting function $I_P$ is weak $^*$ continuous (by Lemma 5) and thus, $I_P(\chi _R \widetilde {A}) = \lim _{i \rightarrow \infty } I_P(\chi _R A_i) = 0$ . We now proceed as in the proof of Lemma 7 to show that $I_P(\textrm{supp}\, \chi _R \widetilde {A}) = 0$ as well: first, approximate $I_P(\textrm{supp}\, \chi _R \widetilde {A})$ by $I_P(B_{\varepsilon })$ , where $B_{\varepsilon } := \{x \in Q(0, R):\, \widetilde {A}(x) \geq \varepsilon \}$ and $\varepsilon> 0$ is a sufficiently small constant (depending on R), and then note that $I_P(B_{\varepsilon }) \leq \varepsilon ^{-|P|} I_P(\chi _R \widetilde {A}) = 0$ for all $\varepsilon> 0$ . Since $\textrm{supp}\, \chi _R \widetilde {A} = A \cap Q(0, R)$ up to zero-measure sets and $R> 0$ is arbitrary, we conclude that $I_P(A) = 0$ as wished.
Next we prove that $d(A) = \mathbf {m}_{\mathbb {R}^d}(P)$ . Since $I_P(A) = 0$ , it follows from Lemma 2 that $\overline {d}(A) \leq \mathbf {m}_{\mathbb {R}^d}(P)$ , and so it suffices to show that
Fix some arbitrary $\varepsilon> 0$ and take $R_0 \geq 2$ large enough so that $(R_0 + 2\textrm{diam}\, P)^d < (1 + \varepsilon /4) R_0^d$ . For any given $R \geq R_0$ , take a P-avoiding set $B_R \subseteq Q(0, R)$ with
For all $i \geq R$ , define $A_i' := B_R \cup (A_i \setminus Q(0, R + 2\textrm{diam}\, P))$ ; note that $A_i'$ avoids P and
Since $\textrm{vol}(A_i') \leq \mathbf {m}_{Q(0, i)}(P)\, i^d$ for all $i \geq R$ and $\int _{Q(0, R)} \big (\widetilde {A}(x) - A_i(x)\big ) \,dx> -\varepsilon $ for all sufficiently large i, we conclude that for large enough i, we have
proving inequality (9).
Finally, since $I_P(A) = 0$ , it follows from Lemma 2 that we can remove a zero-measure subset of A in order to remove all copies of P without changing its density. The theorem follows.
3 Configurations on the sphere
In this section, we turn to the question of whether the methods and results shown in the Euclidean space setting can also be made to work in the spherical setting.
We shall fix an integer $d \geq 2$ throughout this section and work on the d-dimensional unit sphere $\mathbb {S}^d := \big \{x\in \mathbb {R}^{d+1}:\, \|x\|=1\big \}$ . We denote the uniform probability measure on $\mathbb {S}^d$ by $\sigma ^{(d)} = \sigma $ , and the normalized Haar measure on $\textrm{O}(\mathbb {R}^{d+1})$ by $\mu _{d+1} = \mu $ . These two measures are related as follows: if $X \subseteq \mathbb {S}^d$ is a measurable set and $x\in \mathbb {S}^d$ , then
The analogue of the axis-parallel cube in the spherical setting will be the spherical cap: given $x \in \mathbb {S}^d$ and $\rho> 0$ , we denoteFootnote 8
We say $\textrm{Cap}(x, \rho )$ is the spherical cap with center x and radius $\rho $ . Since its measure $\sigma (\textrm{Cap}(x, \rho ))$ does not depend on the center point x, we shall denote this value simply by $\sigma (\textrm{Cap}_{\rho })$ . For a given (measurable) set $A \subseteq \mathbb {S}^d$ , we then write
for the density of A inside this cap.
We define a (spherical) configuration on $\mathbb {S}^d$ as a finite subset of $\mathbb {R}^{d+1}$ which is congruent to a set on $\mathbb {S}^d$ ; it is convenient to allow for configurations that are not necessarily on the sphere in order to consider dilations. Note that, if $P, Q \subset \mathbb {S}^d$ are two configurations which are on the sphere, then $P \simeq Q$ if and only if there is a transformation $T \in \textrm{O}(\mathbb {R}^{d+1})$ for which $P = T \cdot Q$ (translations are no longer necessary in this case).
A spherical configuration P on $\mathbb {S}^d$ is said to be admissible if it has at most d points and if it is congruent to a set $P' \subset \mathbb {S}^d$ which is linearly independent.Footnote 9 As before, we shall say that some set $A \subseteq \mathbb {S}^d$ avoids P if there is no subset of A which is congruent to P.
The natural analogues of the independence density in the spherical setting can now be given. For $n \geq 1$ configurations $P_1, \dots , P_n$ on $\mathbb {S}^d$ , we define the quantities
Whenever convenient, we will state and prove results in the case of only one forbidden configuration, as the more general case of multiple forbidden configurations follows from the same arguments with only trivial modifications (but heavier notation).
The first issue we encounter in the spherical setting is that it is not compatible with dilations: given a set of points $P \subset \mathbb {S}^d$ and some dilation parameter $t> 0$ , it is usually not true that there exists a set $Q \subset \mathbb {S}^d$ congruent to $t P$ . However, there is a large class of configurations (including the ones we call admissible) for which this is true whenever $0 < t \leq 1$ ; we shall say that they are contractible.
It is easy to show that any configuration $P \subset \mathbb {S}^d$ which is contained in a d-dimensional affine hyperplane (e.g., any configuration with at most $d+1$ points) is contractible. Indeed, let $0 < t\leq 1$ and suppose $P \subset \mathbb {S}^d \cap (w+U)$ , where $U \subset \mathbb {R}^{d+1}$ is a d-dimensional subspace and w is orthogonal to U. Then, w is orthogonal to $v-w$ for every $v\in P$ , and one readily checks thatFootnote 10 $sw + tP \subset \mathbb {S}^d$ for $s = \big (t^2 + (1-t^2) \|w\|^{-2}\big )^{1/2} - t$ .
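The contraction formula can be checked numerically. The following sketch assumes $w \neq 0$ (so that the formula for s makes sense) and uses a hypothetical three-point configuration on $\mathbb {S}^2$ lying on the circle $\{z = h\}$ ; the function names are illustrative and not taken from the paper.

```python
import math


def contraction_shift(t, w_norm):
    """Scalar s with s*w + t*P on the unit sphere (formula from the text)."""
    return math.sqrt(t * t + (1 - t * t) / (w_norm * w_norm)) - t


def contract(points, w, t):
    """Map each point v to s*w + t*v, with s given by contraction_shift."""
    s = contraction_shift(t, math.sqrt(sum(c * c for c in w)))
    return [tuple(s * wc + t * vc for wc, vc in zip(w, v)) for v in points]


# Hypothetical example on S^2: three points on the circle {z = h}, so that
# w = (0, 0, h) is orthogonal to the plane direction U = {z = 0}.
h = 0.6
r = math.sqrt(1 - h * h)
angles = [0.0, 2.0, 4.5]
P = [(r * math.cos(a), r * math.sin(a), h) for a in angles]
w = (0.0, 0.0, h)

t = 0.5
Q = contract(P, w, t)

# Each contracted point should again have Euclidean norm 1.
norms = [math.sqrt(sum(c * c for c in q)) for q in Q]
```

Because Q is a translate of $tP$ , all pairwise distances in Q are exactly t times those in P, so Q is a copy of $tP$ sitting on the sphere.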
Even when the configuration we are considering is contractible, however, there is no easy relationship between the independence densities of its distinct dilates. We will then start with the following lemma, which shows that the results we will eventually obtain do not hold for merely trivial reasons.
Lemma 10. For any fixed contractible configuration $P \subset \mathbb {S}^d$ , we have that
Proof. For the first inequality, we note that spherical caps are exactly the closed balls of the separable metric space $\mathbb {S}^d$ endowed with the Euclidean distance. This allows us to use the Vitali Covering Lemma; see, for instance, [Reference Mattila14, Theorem 2.1]. For any given $0 < t \leq 1$ , start with the trivial cover $\mathbb {S}^d = \bigcup _{x\in \mathbb {S}^d} \textrm{Cap}(x,\, \textrm{diam}\, tP)$ and apply the Vitali Covering Lemma to obtain a (necessarily finite) set of center points $\{x_1, \dots , x_N\} \subset \mathbb {S}^d$ such that $\textrm{Cap}(x_i, \textrm{diam}\, tP) \cap \textrm{Cap}(x_j, \textrm{diam}\, tP) = \emptyset $ for $i \neq j$ and
Since the caps $\textrm{Cap}(x_i, \textrm{diam}\, tP)$ are pairwise disjoint, it is easy to see that the set
does not contain any copy of $tP$ . Finally, as the inequality $\sigma (\textrm{Cap}_{\rho }) \geq c_d \sigma (\textrm{Cap}_{20\rho })$ holds for some constant $c_d> 0$ and all $0 < \rho \leq 2$ , denoting $\rho (t) := (\textrm{diam}\, tP)/4$ , we have that
and thus, $\mathbf {m}_{\mathbb {S}^d}(tP) \geq \sigma (A_t) \geq c_d$ for all $0 < t \leq 1$ .
For the second inequality, suppose $A \subseteq \mathbb {S}^d$ avoids $P = \{v_1, \dots , v_k\}$ . Then,
Integrating over $\textrm{O}(\mathbb {R}^{d+1})$ , we obtain
implying that $\sigma (A) \leq 1 - 1/k$ . Thus, $\sup _{0 < t \leq 1} \mathbf {m}_{\mathbb {S}^d}(tP) \leq 1 - 1/|P|$ .
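The counting behind the second inequality can be written out as follows. This is a plausible reconstruction rather than a quotation of the paper's displays: it identifies A with its indicator function and uses the relation $\sigma (X) = \mu (\{T \in \textrm{O}(\mathbb {R}^{d+1}):\, Tx \in X\})$ between the two measures stated at the start of this section.

```latex
\begin{align*}
&\sum_{i=1}^{k} A(T v_i) \leq k - 1
  \quad \text{for every } T \in \mathrm{O}(\mathbb{R}^{d+1}),
  \text{ since } \{T v_1, \dots, T v_k\} \simeq P \text{ is not contained in } A; \\
&k\, \sigma(A)
  = \sum_{i=1}^{k} \int_{\mathrm{O}(\mathbb{R}^{d+1})} A(T v_i) \, d\mu(T)
  \leq k - 1 .
\end{align*}
```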
Given some configuration $P = \{v_1, v_2, \dots , v_k\} \subset \mathbb {S}^d$ , we define the counting function $I_P$ which acts on a bounded measurable function $f: \mathbb {S}^d \rightarrow \mathbb {R}$ by
In the case where f is the indicator function of a set $A \subseteq \mathbb {S}^d$ , we note that
If the spherical configuration P is not a subset of the sphere, we define the function $I_P$ as being equal to $I_Q$ for any $Q \simeq P$ which is contained in $\mathbb {S}^d$ .
As in the Euclidean setting, one can show there is no meaningful difference between requiring that a measurable set $A \subseteq \mathbb {S}^d$ avoids some configuration P or that it only satisfies $I_P(A) = 0$ . This is proven in the next lemma.
Lemma 11 (Zero-measure removal).
Suppose $P \subset \mathbb {S}^d$ is a finite configuration and $A \subseteq \mathbb {S}^d$ is measurable. If $I_P(A) = 0$ , then we can remove a zero-measure subset of A in order to remove all copies of P.
Proof. It will be more convenient to change spaces and work on the orthogonal group $\textrm{O}(\mathbb {R}^{d+1})$ rather than on the sphere $\mathbb {S}^d$ . For $\delta> 0$ and $R \in \textrm{O}(\mathbb {R}^{d+1})$ , denote by
the ball of radius $\delta $ in spectral norm centered on R and let I denote the identity transformation. We will first show that
Let $e\in \mathbb {S}^d$ be an arbitrary point and define on $\textrm{O}(\mathbb {R}^{d+1})$ the (measurable) set $E := \{R \in \textrm{O}(\mathbb {R}^{d+1}):\, Re \in A\}$ . By the Lebesgue Density Theorem on $\textrm{O}(\mathbb {R}^{d+1})$ , we have that
But this means exactly that the measure of the set
of nondensity points is zero. It is clear from the definition of F that it is invariant under the right-action of $\textrm{Stab}^{\textrm{O}(\mathbb {R}^{d+1})}(e)$ ; this implies $\sigma (\{Re:\, R \in F\}) = \mu (F) = 0$ , proving (10).
Now we remove from A all points x for which identity (10) does not hold, thus obtaining a subset $B \subseteq A$ with $\sigma (A \setminus B) = 0$ and
We will show that no copy of P remains on this restricted set B, which will finish the proof of the lemma.
Suppose, for contradiction, that B contains a copy $\{u_1, \dots , u_k\}$ of P. Then, there exists $\delta> 0$ for which
which means that $\mathbb {P}_{T \in \mathcal {B}(I, \delta )}(T u_i \notin B) \leq 1/2k$ for each $1 \leq i \leq k$ . Thus,
contradicting our assumption that $I_P(A) = 0$ .
3.1 Harmonic analysis on the sphere and the Counting Lemma
The next thing we need is an analogue of the Counting Lemma in the spherical setting, saying we do not significantly change the count of configurations in a given set $A \subseteq \mathbb {S}^d$ by blurring this set a little. As in the Euclidean setting, we will use Fourier-analytic methods to prove such a result; we now give a quick overview of the definitions and results on harmonic analysis we will need for our arguments.
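Given an integer $n \geq 0$, we write $\mathscr {H}^{d+1}_n$ for the space of real harmonic polynomials, homogeneous of degree n, on $\mathbb {R}^{d+1}$. That is,
$$ \mathscr{H}^{d+1}_n := \big\{ f \in \mathbb{R}[x_1, \dots, x_{d+1}]:\, f \text{ is homogeneous of degree } n \text{ and } \Delta f = 0 \big\}, $$
where $\Delta = \partial^2/\partial x_1^2 + \dots + \partial^2/\partial x_{d+1}^2$ denotes the Laplacian.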
The restrictions of the elements of $\mathscr {H}^{d+1}_n$ to $\mathbb {S}^d$ are called spherical harmonics of degree n on $\mathbb {S}^d$. If $Y \in \mathscr {H}^{d+1}_n$, note that $Y(x) = \|x\|^n Y(x')$ where $x = \|x\| x'$ and $x' \in \mathbb {S}^d$; we can then identify $\mathscr {H}^{d+1}_n$ with the space of spherical harmonics of degree n, which, by a slight (and common) abuse of notation, we also denote by $\mathscr {H}^{d+1}_n$.
Harmonic polynomials of different degrees are orthogonal with respect to the standard inner product $\langle f,\, g \rangle _{\mathbb {S}^d} := \int _{\mathbb {S}^d} f(x) g(x) \,d\sigma (x)$. Moreover, it is a well-known fact (see e.g., [Reference Dai and Xu5, Chapter 1.1]) that the family of spherical harmonics is dense in $L^2(\mathbb {S}^d)$, and so
$$ L^2(\mathbb{S}^d) = \bigoplus_{n=0}^{\infty} \mathscr{H}^{d+1}_n. $$
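Denoting by $\mathrm{{proj}}_{n}: L^2(\mathbb {S}^d) \rightarrow \mathscr {H}^{d+1}_n$ the orthogonal projection onto $\mathscr {H}^{d+1}_n$, what this means is that $f = \sum _{n=0}^{\infty } \mathrm{{proj}}_n f$ for all $f \in L^2(\mathbb {S}^d)$ (with equality in the $L^2$ sense). By orthogonality, we obtain Parseval’s identity:
$$ \langle f,\, g \rangle_{\mathbb{S}^d} = \sum_{n=0}^{\infty} \langle \mathrm{proj}_n f,\, \mathrm{proj}_n g \rangle_{\mathbb{S}^d} \quad \text{for all } f, g \in L^2(\mathbb{S}^d). $$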
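There is a family $(P_n^d)_{n \geq 0}$ of polynomials on $[-1, 1]$, usually called ultraspherical or Gegenbauer polynomials, which is associated to this decomposition. We use the convention that $\deg P_n^d = n$ and $P_n^d(1) = 1$. These polynomials can be defined via the addition formula
$$ \begin{align} P_n^d(x \cdot y) = \frac{1}{\dim \mathscr{H}_n^{d+1}} \sum_{i=1}^{\dim \mathscr{H}_n^{d+1}} Y_i(x) Y_i(y) \quad \text{for all } x, y \in \mathbb{S}^d, \tag{11} \end{align} $$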
where $\{Y_i:\, 1\leq i\leq \dim \mathscr {H}_n^{d+1}\}$ is an (arbitrary) orthonormal basis of $\mathscr {H}_n^{d+1}$ . We refer the reader to Chapter 1.2 of Dai and Xu’s book [Reference Dai and Xu5] for the proof that this formula is independent of the choice of basis, and that it indeed defines a polynomial on $[-1, 1]$ .
The next theorem collects several properties of the Gegenbauer polynomials which will be useful for us.
Theorem 8. For all integers $d \geq 2$ and $n \geq 0$ , the following hold:
(i) $P^d_n(t) \in [-1, 1]$ for all $t \in [-1, 1]$.
(ii) The projection operator $\mathrm{{proj}}_n: L^2(\mathbb {S}^d) \rightarrow \mathscr {H}^{d+1}_n$ is given by
$$ \begin{align} \mathrm{{proj}}_n f(x) = \dim \mathscr{H}_n^{d+1} \int_{\mathbb{S}^d} P_n^d(x \cdot y) f(y) \,d\sigma(y). \tag{12} \end{align} $$
(iii) For each fixed $y, z \in \mathbb {S}^d$, we have
$$ \begin{align} \int_{\mathbb{S}^d} P_n^d(x \cdot y) P_n^d(x \cdot z) \,d\sigma(x) = \frac{1}{\dim \mathscr{H}_n^{d+1}} P_n^d(y \cdot z). \tag{13} \end{align} $$
(iv) For any fixed $\gamma> 0$, $\max _{t \in [-1+\gamma ,\, 1-\gamma ]} |P^d_n(t)|$ tends to zero as $n \rightarrow \infty $.
Proof. The first three items follow easily from the addition formula (11). Indeed, fix some orthonormal basis $\{Y_i:\, 1\leq i\leq \dim \mathscr {H}_n^{d+1}\}$ of $\mathscr {H}_n^{d+1}$ . Then,
which by the triangle inequality followed by Cauchy-Schwarz is at most
proving $(i)$ . Item $(ii)$ follows from the chain of equalities
To prove item $(iii)$ , note that
which equals the right-hand side of (13) by definition.
Finally, the last item immediately follows from the more precise asymptotic bound given in [Reference Szegő21, Theorem 8.21.6].
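As a numerical sanity check of items (i) and (iv), the following Python sketch evaluates $P_n^d$ through the classical three-term recurrence for Gegenbauer polynomials, rescaled so that $P_n^d(1) = 1$. Neither the recurrence nor the helper names `gegenbauer_normalized` and `max_abs_on_interior` come from the text above; they are assumptions made for illustration, and the check is a finite-sample experiment rather than a proof.

```python
def gegenbauer_normalized(n, d, t):
    """Value at t of the degree-n Gegenbauer polynomial attached to S^d,
    normalized so that P(1) = 1 (assumed recurrence, valid for d >= 2)."""
    if n == 0:
        return 1.0
    prev, curr = 1.0, t  # P_0(t) = 1 and P_1(t) = t
    for m in range(1, n):
        # (m + d - 1) P_{m+1}(t) = (2m + d - 1) t P_m(t) - m P_{m-1}(t)
        prev, curr = curr, ((2 * m + d - 1) * t * curr - m * prev) / (m + d - 1)
    return curr


def max_abs_on_interior(n, d, gamma, samples=201):
    """Maximum of |P_n^d(t)| over an evenly spaced grid of [-1 + gamma, 1 - gamma]."""
    lo, hi = -1.0 + gamma, 1.0 - gamma
    grid = [lo + (hi - lo) * i / (samples - 1) for i in range(samples)]
    return max(abs(gegenbauer_normalized(n, d, t)) for t in grid)


if __name__ == "__main__":
    # Item (iv): for fixed gamma, the interior maximum shrinks as n grows.
    for n in (5, 50, 200):
        print(n, max_abs_on_interior(n, d=2, gamma=0.2))
```

For $d = 2$ the recurrence reduces to the usual one for Legendre polynomials, which gives a quick way to cross-check the implementation against tabulated values.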
We will follow Dunkl [Reference Dunkl8] in defining both the convolution operation on the sphere and the spherical analogue of Fourier coefficients. For this, we will need to break the symmetry of the sphere a little and distinguish an (arbitrary) point e on $\mathbb {S}^d$ ; we think of this point as being the north pole. Write $\mathcal {M}(\mathbb {S}^d; e)$ for the space of Borel regular zonal measures on $\mathbb {S}^d$ with pole at e, that is, those measures which are invariant under the action of $\textrm{Stab}^{\textrm{O}(\mathbb {R}^{d+1})}(e)$ . We will refer to the elements of $\mathcal {M}(\mathbb {S}^d; e)$ simply as zonal measures.
Given a function $f \in L^2(\mathbb {S}^d)$ and a zonal measure $\nu \in \mathcal {M}(\mathbb {S}^d; e)$, we define their convolution $f * \nu $ by
$$ f * \nu(x) := \int_{\mathbb{S}^d} f(T_x y) \,d\nu(y), $$
where $T_x \in \textrm{O}(\mathbb {R}^{d+1})$ is an arbitrary element satisfying $T_x e = x$ . It is easy to see that this operation is well-defined, independently of the choice of $T_x$ : if $S_x e = T_x e = x$ , then $S_x^{-1} T_x \in \textrm{Stab}(e)$ and so $\nu (S_x^{-1} A) = \nu ((S_x^{-1} T_x) T_x^{-1} A) = \nu (T_x^{-1} A)$ . The value $f * \nu (x)$ can be thought of as the average of f according to a measure which acts with respect to x as $\nu $ acts with respect to the north pole e.
For an integer $n \geq 0$ and a zonal measure $\nu \in \mathcal {M}(\mathbb {S}^d; e)$, we define its n-th Fourier coefficient $\widehat {\nu }_n$ by
$$ \widehat{\nu}_n := \int_{\mathbb{S}^d} P_n^d(e \cdot y) \,d\nu(y). $$
The main property of Fourier coefficients we will need is the following result, which is stated in Dunkl’s paper [Reference Dunkl8] and can be proven by a straightforward modification of the methods presented in Chapter $2$ of Dai and Xu’s book [Reference Dai and Xu5].
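Theorem 9. If $f \in L^2(\mathbb {S}^d)$ and $\nu \in \mathcal {M}(\mathbb {S}^d; e)$, then $f * \nu \in L^2(\mathbb {S}^d)$ and
$$ \mathrm{proj}_n (f * \nu) = \widehat{\nu}_n \, \mathrm{proj}_n f \quad \text{for all } n \geq 0. $$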
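With this, we finish our review of harmonic analysis on the sphere, so let us return to our specific problem. For a given $\delta> 0$, denote by $\textrm{Cap}_{\delta }$ the uniform probability measure on the spherical cap $\textrm{Cap}(e, \delta )$:
$$ \textrm{Cap}_{\delta}(A) := \frac{\sigma\big(A \cap \textrm{Cap}(e, \delta)\big)}{\sigma\big(\textrm{Cap}(e, \delta)\big)} \quad \text{for all measurable } A \subseteq \mathbb{S}^d. $$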
Note that each $\textrm{Cap}_{\delta }$ is a zonal measure. One immediately checks that
for all $n \geq 0$ , and
for all $f \in L^2(\mathbb {S}^d)$ . In particular, if $A \subseteq \mathbb {S}^d$ is a measurable set, then $A * \textrm{Cap}_{\delta }(x) = d_{\textrm{Cap}(x, \delta )}(A)$ ; this gives the ‘blurring’ of the spherical sets we shall consider.
Lemma 12. For every $d \geq 2$ and $\gamma> 0$ , there exists a function $c_{d, \gamma }: (0, 1] \rightarrow \mathbb {R}$ with $\lim _{\delta \rightarrow 0^+} c_{d, \gamma }(\delta ) = 0$ such that the following holds: for all $f, g \in L^2(\mathbb {S}^d)$ and all points $u, v \in \mathbb {S}^d$ with $|u \cdot v| \leq 1 - \gamma $ , we have that
Proof. Denote by $\nu _e$ the Haar measure on $\textrm{Stab}(e)$ , and assume without loss of generality that u coincides with the north pole e. By symmetry, the expression we wish to bound may then be written as
where $h = g - g * \textrm{Cap}_{\delta }$ .
Write $t_0 := e \cdot v$ . Note that when $S \in \textrm{Stab}(e)$ is distributed uniformly according to $\nu _e$ , the point $Sv$ is uniformly distributed on $\mathbb {S}^{d-1}_{t_0} := \{ y \in \mathbb {S}^d: e \cdot y = t_0 \}$ . Denote by $\sigma _{t_0}^{(d-1)}$ the uniform probability measure on $\mathbb {S}^{d-1}_{t_0}$ (that is, the unique one which is invariant under the action of $\textrm{Stab}(e)$ ). Making the change of variables $y = Sv$ , we see that
The expression we wish to bound is then equal to
Using Parseval’s Identity, we can rewrite the right-hand side of the last equality as
where we used Theorem 9 and then Cauchy-Schwarz. As $h = g - g * \textrm{Cap}_{\delta }$ , the expression above is equal to
Fix some $\varepsilon> 0$ . Since $t_0 \in [-1 + \gamma ,\, 1 - \gamma ]$ (by hypothesis), from Theorem 8, we obtain that $|P^d_n(t_0)| \leq \varepsilon /2$ holds for all $n \geq N(\varepsilon , \gamma )$ , while
always holds. Moreover, since each $P^d_n$ is a polynomial satisfying $P^d_n(1) = 1$ , we can choose $\delta _0 = \delta _0(\varepsilon , \gamma )> 0$ small enough so that $|1 - P^d_n(e \cdot y)| \leq \varepsilon $ holds whenever $n < N(\varepsilon , \gamma )$ and $y \in \textrm{Cap}(e, \delta _0)$ . This implies that the last sum is at most
whenever $\delta \leq \delta _0(\varepsilon , \gamma )$ , finishing the proof.
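The mechanism behind this proof can be observed numerically in the simplest case $d = 2$. The sketch below rests on two assumptions that are not spelled out in the text above: that $\textrm{Cap}(e, \delta)$ consists of the points of $\mathbb{S}^2$ at Euclidean distance at most $\delta$ from e, and that the n-th Fourier coefficient of $\textrm{Cap}_{\delta}$ is the average of $P_n^2(e \cdot y)$ over this cap. Under these assumptions $e \cdot y$ is uniformly distributed on $[1 - \delta^2/2,\, 1]$ and $P_n^2$ is the Legendre polynomial of degree n, so the coefficient has a closed form. The function names are hypothetical, and truncating the supremum at `n_max` is an artifact of the experiment.

```python
def legendre(n, t):
    """Legendre polynomial of degree n at t, via the three-term recurrence."""
    if n == 0:
        return 1.0
    prev, curr = 1.0, t
    for m in range(1, n):
        prev, curr = curr, ((2 * m + 1) * t * curr - m * prev) / (m + 1)
    return curr


def cap_coefficient(n, delta):
    """Average of P_n(s) over s in [a, 1] with a = 1 - delta^2 / 2 (requires delta > 0).

    Uses the identity: integral of P_n over [a, 1] = (P_{n-1}(a) - P_{n+1}(a)) / (2n + 1).
    """
    if n == 0:
        return 1.0
    a = max(1.0 - delta * delta / 2.0, -1.0)
    return (legendre(n - 1, a) - legendre(n + 1, a)) / ((2 * n + 1) * (1.0 - a))


def worst_mode(t0, delta, n_max=400):
    """max over n <= n_max of |P_n(t0)| * |1 - cap_coefficient(n, delta)|."""
    return max(
        abs(legendre(n, t0)) * abs(1.0 - cap_coefficient(n, delta))
        for n in range(n_max + 1)
    )


if __name__ == "__main__":
    # Low modes are almost untouched by the blurring (coefficient close to 1),
    # while high modes are damped by the decay of |P_n(t0)| for t0 away from +-1.
    for delta in (0.5, 0.1, 0.01):
        print(delta, worst_mode(0.3, delta))
```

The quantity returned by `worst_mode` plays the role of the uniform bound on $|P^d_n(t_0)|\, |1 - \widehat{(\textrm{Cap}_{\delta})}_n|$ used at the end of the proof: it becomes small as $\delta$ decreases, provided $t_0$ stays away from $\pm 1$.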
Recall that a spherical configuration P is admissible if it has at most d points and if it is congruent to a set $P' \subset \mathbb {S}^d$ which is linearly independent. We can now give the spherical counterpart to the Counting Lemma from the last section.
Lemma 13 (Counting Lemma).
For every admissible configuration P on $\mathbb {S}^d$ , there exists a function $\eta _P: (0, 1] \rightarrow (0, 1]$ with $\lim _{\delta \rightarrow 0^+} \eta _P(\delta ) = 0$ such that the following holds for all measurable sets $A \subseteq \mathbb {S}^d$ :
Moreover, this upper-bound function $\eta _P$ can be made to hold uniformly over all configurations $P'$ inside a neighborhood of P.
Proof. Up to congruence, we may assume $P \subset \mathbb {S}^d$ . Similarly to what we did in the Euclidean setting, we will first obtain a uniform upper bound for
valid whenever $0 \leq f_1, \dots , f_k \leq 1$ are measurable functions and $(v_1, v_2, \dots , v_k)$ is a permutation of the points of P.
Denote by $G := \textrm{Stab}^{\textrm{O}(\mathbb {R}^{d+1})}(v_1, \dots , v_{k-2})$ the stabilizer of the first $k-2$ points of P, and by $H := \textrm{Stab}^{\textrm{O}(\mathbb {R}^{d+1})}(v_1, \dots , v_{k-2}, v_{k-1}) = \textrm{Stab}^{G}(v_{k-1})$ the stabilizer of the first $k-1$ points of P. We can then bound the expression above by
where $\mu _G$ denotes the normalized Haar measure on G.
Denote $\ell := d-k+2 \geq 2$. Since P is nondegenerate, we see that $G \simeq \textrm{O}(\mathbb {R}^{\ell +1})$ and that both $Gv_{k-1}$ and $Gv_k$ are spheres of dimension $\ell $. Morally, we should then be able to apply the last lemma (with $d = \ell $, $f = f_{k-1}(T\cdot )$ and $g = f_k(T\cdot )$) and easily conclude. However, the convolution in expression (15) above happens in $\mathbb {S}^d$, while that in the last lemma would happen in $\mathbb {S}^{\ell }$; in particular, if $k \geq 3$ so that $\ell < d$, all of the mass of the average defined by the convolution in (15) lies outside of the $\ell $-dimensional sphere $Gv_k$, so this argument cannot work. We will have to work harder to conclude.
Note that since $Gv_k$ is an $\ell $ -dimensional sphere while $Hv_k$ is an $(\ell -1)$ -dimensional sphere (which happens because P is nondegenerate), it follows that there is a point $\xi \in Gv_k$ which is fixed by H; this point will work as the north pole of $Gv_k$ .
It will be more convenient to work on the canonical unit sphere $\mathbb {S}^{\ell }$ instead of the $\ell $ -dimensional sphere $Gv_k \subset \mathbb {S}^d$ . We shall then restrict ourselves to the $(\ell +1)$ -dimensional affine hyperplane $\mathcal {H}$ determined by $\mathcal {H} \cap \mathbb {S}^d = Gv_k$ , and place coordinates on it to identify $\mathcal {H}$ with $\mathbb {R}^{\ell +1}$ and $Gv_k$ with $\mathbb {S}^{\ell }$ , noting that G then acts as $\textrm{O}(\mathbb {R}^{\ell +1})$ . More formally, let $r> 0$ be the radius of $Gv_k$ in $\mathbb {R}^{d+1}$ so that $Gv_k$ is isometric to $r \mathbb {S}^{\ell }$ ; take such an isometry $\psi : Gv_k \rightarrow r \mathbb {S}^{\ell }$ , and define $e \in \mathbb {S}^{\ell }$ by $e := \psi (\xi )/r$ . Now, we construct a map $\phi : G \rightarrow \textrm{O}(\mathbb {R}^{\ell +1})$ defined by
for each $S \in G$ . It is easy to check that this map is well-defined and gives an isomorphism between G and $\textrm{O}(\mathbb {R}^{\ell +1})$ , satisfying $\phi (H) = \textrm{Stab}^{\textrm{O}(\mathbb {R}^{\ell +1})}(e)$ .
For each fixed $T \in \textrm{O}(\mathbb {R}^{d+1})$ , define the functions $g_T, h_T: \mathbb {S}^{\ell } \rightarrow [-1, 1]$ by
for all $R \in \textrm{O}(\mathbb {R}^{\ell +1})$ . These functions are indeed well-defined on $\mathbb {S}^{\ell }$ since $\textrm{Stab}^G(v_{k-1}) = \textrm{Stab}^G(\xi ) = \phi ^{-1}(\textrm{Stab}^{\textrm{O}(\mathbb {R}^{\ell +1})}(e))$ . Note that $h_T$ can also be written as a function of $x \in \mathbb {S}^{\ell }$ by making use of the isometry $\psi ^{-1}:\, r \mathbb {S}^{\ell }\rightarrow Gv_k$ :
Denote by $u := \psi (v_k)/r$ the point in $\mathbb {S}^{\ell }$ corresponding to $v_k$ . Making the change of variables $R = \phi (S)$ , we obtain
where we write $\textrm{Stab}(e)$ for $\textrm{Stab}^{\textrm{O}(\mathbb {R}^{\ell +1})}(e)$ and $\nu _e$ for its Haar measure. Working as we did to obtain equation (14), we see that the expression in parentheses is equal to $h_T * \sigma _{e \cdot u}^{(\ell -1)}(Re)$ , where $\sigma _{e \cdot u}^{(\ell -1)}$ is the uniform probability measure on the $(\ell -1)$ -sphere $\textrm{Stab}(e) u = \{y \in \mathbb {S}^{\ell }: e \cdot y = e \cdot u\}$ (and the convolution now takes place in $\mathbb {S}^{\ell }$ with e as the north pole). Making the change of variables $x = Re$ , we then see that the expression above is equal to
We conclude that the expression (15) we wish to bound is equal to
where we applied Cauchy-Schwarz twice.
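Let us now compute $e \cdot u$, which will be necessary for bounding $\|h_T * \sigma _{e \cdot u}^{(\ell -1)}\|_2^2$. From the identity
$$ r^2 \|e - u\|^2 = \|\psi(\xi) - \psi(v_k)\|^2 = \|\xi - v_k\|^2, $$
which holds because $\psi$ is an isometry, we conclude that $r^2(2 - 2\, e \cdot u) = 2 - 2\, \xi \cdot v_k$, and so
$$ e \cdot u = 1 - \frac{1 - \xi \cdot v_k}{r^2} $$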
depends only on the ordering $(v_1, \dots , v_k)$ of P and not on our later choices (note that this value depends continuously on the points $v_1, \dots , v_k$ and so is bounded away from $\{-1,\, 1\}$ uniformly over all configurations $P'$ close enough to P).
Now fix an arbitrary $\varepsilon> 0$ . By Parseval’s Identity and Theorem 9, we have that
Since $e \cdot u \notin \{-1,\, 1\}$ is a constant depending only on P, by Theorem 8, there exists $N = N(\varepsilon , P) \in \mathbb {N}$ such that $|P^{\ell }_n(e \cdot u)| \leq \varepsilon $ for all $n> N$ (by that same theorem, this value of N can be made robust to small perturbations of the value $e \cdot u$ , which corresponds to small perturbations of the configuration P). Using that $|P^{\ell }_n(t)| \leq 1$ for all $-1 \leq t \leq 1$ , we conclude that
The second term on the right-hand side of the inequality above is upper bounded by $\varepsilon ^2 \|h_T\|_2^2 \leq \varepsilon ^2$ , so let us concentrate on the first term.
By identities (12) and (13), we have
Since $|P_n^{\ell }(y \cdot z)| \leq 1$ for all $y, z \in \mathbb {S}^{\ell }$ , we conclude that
We now divide this last double integral on the sphere into two parts, depending on whether or not $y \cdot z$ is close to the extremal points $1$ or $-1$ . Thus, for some parameter $0 < \gamma < 1$ to be chosen later, we write the double integral as
Since $-1 \leq h_T \leq 1$ , the first term is at most
To bound the second term, note that for fixed $y, z \in \mathbb {S}^{\ell }$ , we have
where $\widetilde {y} := \psi ^{-1}(ry)$ and $\widetilde {z} := \psi ^{-1}(rz)$ . Moreover, we have
thus, whenever $|y \cdot z| \leq 1-\gamma $ , we have $|\widetilde {y} \cdot \widetilde {z}| \leq 1-r^2\gamma $ . Using Lemma 12 (with $f = f_k - f_k * \textrm{Cap}_{\delta }$ , $g = f_k$ and $\gamma $ substituted by $r^2\gamma $ ), we conclude that the second term is bounded by $c_{d, r^2\gamma }(\delta )$ .
Taking stock of everything, we obtain
for any $0 < \gamma < 1$ . Choosing $\gamma $ small enough depending on $\ell $ , $\varepsilon $ and N, and then choosing $\delta $ small enough depending on d, $r^2\gamma $ , $\varepsilon $ and N (so ultimately only on $\varepsilon $ and P), we can bound the right-hand side above by $4\varepsilon ^2$ ; the expression (15) is then bounded by $2 \varepsilon $ in this case.
For such small values of $\delta $ , we thus conclude from our telescoping sum trick (explained in Section 2.1) that $|I_P(A) - I_P(A * \textrm{Cap}_{\delta })| \leq 2k\varepsilon $ , proving the desired inequality since $\varepsilon> 0$ is arbitrary. The claim that the upper bound can be made uniform inside some neighborhood of P follows from analyzing our proof.
We remark that the proof of the Counting Lemma given above is the only place where we explicitly use the assumption that a spherical configuration is admissible. This assumption, however, is inherited by all later results whose proofs rely on the Counting Lemma.
3.2 Continuity properties of the counting function
Following the same script as in the Euclidean setting, we now consider other ways in which the counting function is robust to small perturbations.
It is again easy to show, using our telescoping sum trick, that $I_P$ is continuous in $L^{\infty }(\mathbb {S}^d)$ (and even in $L^{|P|}(\mathbb {S}^d)$) for all spherical configurations. When the configuration considered is admissible, we also obtain the following significantly stronger continuity property of $I_P$ when restricting to bounded functions.
Lemma 14 (Weak $^*$ continuity).
If P is an admissible configuration on $\mathbb {S}^d$ , then $I_P$ is weak $^*$ continuous on the unit ball of $L^{\infty }(\mathbb {S}^d)$ .
Proof. Denote the closed unit ball of $L^{\infty }(\mathbb {S}^d)$ by $\mathcal {B}_{\infty }$ , and let $(f_i)_{i \geq 1} \subset \mathcal {B}_{\infty }$ be a sequence weak $^*$ converging to $f \in \mathcal {B}_{\infty }$ . It will suffice to show that $\left (I_P(f_i)\right )_{i \geq 1}$ converges to $I_P(f)$ .
Note that, for every $x \in \mathbb {S}^d$ , $\delta> 0$ , we have
Since $f * \textrm{Cap}_{\delta }$ and each $f_i * \textrm{Cap}_{\delta }$ are Lipschitz with the same constant (depending only on $\delta $ ) and $\mathbb {S}^d$ is compact, this easily implies that
In particular, we conclude $\lim _{i \rightarrow \infty } I_P(f_i * \textrm{Cap}_{\delta }) = I_P(f * \textrm{Cap}_{\delta })$ .
Since P is admissible, by the spherical Counting Lemma, we have
Choosing $i_0(\delta ) \geq 1$ sufficiently large so that
we conclude that
Since $\delta> 0$ is arbitrary and $\eta _P(\delta ) \rightarrow 0$ as $\delta \rightarrow 0$ , this finishes the proof.
Given some spherical configuration $P = \{v_1, \dots , v_k\} \subset \mathbb {R}^{d+1}$ , let us write $\mathcal {B}(P, r) \subset (\mathbb {R}^{d+1})^k$ for the ball of radius r centered on P, where the distance from P to $Q = \{u_1, \dots , u_k\}$ is given by
If P is an admissible spherical configuration, note that all configurations inside a small enough ball centered on P will also be admissible.
We will later need an equicontinuity property for the family of counting functions $P \mapsto I_P(A)$ over all measurable sets $A \subseteq \mathbb {S}^d$ ; this is given in the following lemma.
Lemma 15 (Equicontinuity).
For every admissible $P \subset \mathbb {S}^d$ and every $\varepsilon> 0$ , there exists $\delta> 0$ such that
Proof. We will use the fact that the function $\eta _P$ obtained in the Counting Lemma can be made uniform inside a small ball centered on P. In other words, there is $r> 0$ and a function $\eta _P': (0, 1] \rightarrow (0, 1]$ with $\lim _{t \rightarrow 0} \eta _P'(t) = 0$ such that
Now, for a given $\rho> 0$ and all $0 < \delta < \rho $ , we see from the triangle inequality that
and so $\sigma \big (\textrm{Cap}(x,\, \rho ) \setminus \textrm{Cap}(y,\, \rho )\big ) \leq \sigma (\textrm{Cap}_{\rho }) - \sigma (\textrm{Cap}_{\rho - \delta })$ . This implies that, for any set $A \subseteq \mathbb {S}^d$ and any $x,\, y \in \mathbb {S}^d$ with $\|x - y\| \leq \delta $ , we have
By our telescoping sum trick, whenever $\|Q - P\|_{\infty } \leq \delta $ , we conclude that
Take $\rho> 0$ small enough so that $\eta _P'(\rho ) \leq \varepsilon /3$, and for this value of $\rho $ take $0 < \delta < r$ small enough so that $\sigma (\textrm{Cap}_{\rho - \delta }) \geq (1 - \varepsilon /(3k)) \,\sigma (\textrm{Cap}_{\rho })$. Then, for any $Q \in \mathcal {B}(P,\, \delta )$ and any measurable set $A \subseteq \mathbb {S}^d$, we have
as wished.
3.3 The spherical Supersaturation Theorem
Having proven that the counting function for admissible spherical configurations is robust to various kinds of small perturbations, we next show that it also satisfies a useful supersaturation property.
This is the second main technical tool we need to study the independence density in the spherical setting, and due to the fact that the unit sphere is compact, both its statement and proof are somewhat simpler than in the Euclidean space setting.
Theorem 10 (Supersaturation Theorem).
For every admissible configuration P on $\mathbb {S}^d$ and every $\varepsilon> 0$ , there exists a constant $c(\varepsilon )> 0$ such that the following holds: if $A \subseteq \mathbb {S}^d$ satisfies $\sigma (A) \geq \mathbf {m}_{\mathbb {S}^d}(P) + \varepsilon $ , then $I_P(A) \geq c(\varepsilon )$ .
Proof. Suppose, for contradiction, that the result is false; then, there exist some $\varepsilon> 0$ and some sequence $(A_i)_{i \geq 1}$ of sets, each of density at least $\mathbf {m}_{\mathbb {S}^d}(P) + \varepsilon $ , which satisfy $\lim _{i \rightarrow \infty } I_P(A_i) = 0$ .
Note that the unit ball $\mathcal {B}_{\infty }$ of $L^{\infty }(\mathbb {S}^d)$ is weak $^*$ compact and also metrizable in this topology (see [Reference Megginson15, Chapter 2.6]). By possibly restricting to a subsequence, we may then assume that $(A_i)_{i \geq 1}$ converges in the weak $^*$ topology of $L^{\infty }(\mathbb {S}^d)$ ; let us denote its limit by $A \in \mathcal {B}_{\infty }$ . It is clear that $0 \leq A \leq 1$ almost everywhere, and $\int _{\mathbb {S}^d} A(x) \,d\sigma (x) = \lim _{i \rightarrow \infty } \sigma (A_i) \geq \mathbf {m}_{\mathbb {S}^d}(P) + \varepsilon $ . By weak $^*$ continuity of $I_P$ (Lemma 14), we also have $I_P(A) = \lim _{i \rightarrow \infty } I_P(A_i) = 0$ .
Now let $B := \{x \in \mathbb {S}^d:\, A(x) \geq \varepsilon \}$ . Since
we conclude that $I_P(B) \leq \varepsilon ^{-|P|} I_P(A) = 0$ and
But this set B contradicts Lemma 11, finishing the proof.
It will be useful to also introduce a spherical analogue of the zooming-out operator, which acts on measurable spherical sets and represents the points on the sphere around which the considered set has a somewhat high density. Given quantities $\delta $ , $\gamma> 0$ , we denote by $\mathcal {Z}_{\delta }(\gamma )$ the operator which takes a measurable set $A \subseteq \mathbb {S}^d$ to the set
The most important property of the zooming-out operator is the following result.
Corollary 2. For every admissible configuration P on $\mathbb {S}^d$ and every $\varepsilon> 0$ , there exists $\delta _0> 0$ such that the following holds for all $\delta \leq \delta _0$ : if $A \subseteq \mathbb {S}^d$ satisfies
then A contains a congruent copy of P.
Proof. By the Supersaturation Theorem, we know that
holds for all $\delta> 0$ . By the Counting Lemma, we then have
Since $\eta _P(\delta ) \rightarrow 0$ as $\delta \rightarrow 0$ , there is some $\delta _0> 0$ such that for all $\delta \leq \delta _0$ , we can conclude $I_P(A)> 0$ ; this implies that A contains a copy of P.
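To make the zooming-out operator concrete, here is a toy sketch in which the sphere is replaced by the cyclic group $\mathbb{Z}_N$ (a discretized circle) and the cap $\textrm{Cap}(x, \delta)$ by the window $\{x - w, \dots, x + w\}$. It assumes that $\mathcal{Z}_{\delta}(\gamma)$ sends A to the set of centers x for which $d_{\textrm{Cap}(x, \delta)}(A) \geq \gamma$, which is how the operator is used in the arguments of this section. The discrete model and the names `window_density` and `zoom_out` are illustrative assumptions only.

```python
from fractions import Fraction


def window_density(A, N, w, x):
    """Density of A inside the discrete 'cap' {x - w, ..., x + w} of Z_N (needs 2w + 1 <= N)."""
    hits = sum(1 for k in range(-w, w + 1) if (x + k) % N in A)
    return Fraction(hits, 2 * w + 1)


def zoom_out(A, N, w, gamma):
    """Centers x in Z_N around which A has density at least gamma at scale w."""
    return {x for x in range(N) if window_density(A, N, w, x) >= gamma}


if __name__ == "__main__":
    A = set(range(10))  # an 'arc' of 10 consecutive points in Z_40
    print(sorted(zoom_out(A, 40, 2, Fraction(1, 2))))
    print(sorted(zoom_out(A, 40, 2, Fraction(1, 5))))
```

Lowering the threshold $\gamma$ enlarges the zoomed-out set, since centers near the boundary of A already see a small positive density; exact rational arithmetic avoids rounding issues at the threshold.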
3.4 From the sphere to spherical caps
We must now tackle the problem of obtaining a relationship between the independence density $\mathbf {m}_{\mathbb {S}^d}(P)$ of a given configuration $P \subset \mathbb {S}^d$ and its spherical cap version $\mathbf {m}_{\textrm{Cap}(x, \rho )}(P)$ , as this will be needed later.
In the Euclidean setting, this was very easy to do (see Lemma 1), using the fact that we can tessellate $\mathbb {R}^d$ with cubes $Q(x, R)$ of any given side length $R> 0$ . This is no longer the case in the spherical setting, as it is impossible to completely cover $\mathbb {S}^d$ using nonoverlapping spherical caps of some given radius; in fact, this cannot be done even approximately if we require the radii of the spherical caps to be the same (as we did with the side length of the cubes in $\mathbb {R}^d$ ).
We will then need to use a much weaker ‘almost-covering’ result, saying that we can cover almost all of the sphere by using finitely many nonoverlapping spherical caps with possibly different radii. Such a collection of disjoint spherical caps is called a cap packing. For technical reasons, we will also want the radii of the caps in this packing to be arbitrarily small.
Lemma 16. For every $\varepsilon> 0$ , there is a finite cap packing
of $\mathbb {S}^d$ with density $\sigma (\mathcal {P})> 1 - \varepsilon $ and with radii $\rho _i \leq \varepsilon $ for all $1 \leq i \leq N$ .
Proof. We will use the same notation for both a collection of caps and the set of points on $\mathbb {S}^d$ which belong to (at least) one of these caps. The desired packing $\mathcal {P}$ will be constructed in several steps, starting with $\mathcal {P}_0 := \{\textrm{Cap}(e, \varepsilon )\}$ .
Now, suppose $\mathcal {P}_{i-1}$ has already been constructed (and is finite) for some $i \geq 1$ and let us construct $\mathcal {P}_i$ . Define
and note that $\mathcal {C}_i$ is a covering of $\mathbb {S}^d \setminus \mathcal {P}_{i-1}$ by caps of positive radii (since $\mathcal {P}_{i-1}$ is closed on $\mathbb {S}^d$ ). By the Vitali Covering Lemma, there is a countable subcollection
of disjoint caps in $\mathcal {C}_i$ such that $\mathbb {S}^d \setminus \mathcal {P}_{i-1} \subseteq \bigcup _{j=1}^{\infty } \textrm{Cap}(x_j, 5r_j)$ . In particular,
where we denote $K_d := \sup _{r> 0} \sigma (\textrm{Cap}_{5r})/\sigma (\textrm{Cap}_r) < \infty $ . Taking $N_i \in \mathbb {N}$ such that
we see that $\mathcal {P}_i' := \{\textrm{Cap}(x_j, r_j):\, 1 \leq j \leq N_i\} \subset \mathbb {S}^d \setminus \mathcal {P}_{i-1}$ satisfies
Now, set $\mathcal {P}_i := \mathcal {P}_{i-1} \cup \mathcal {P}_i'$ ; this is a finite cap packing with
(where the last inequality follows by induction). Taking $n \geq 1$ large enough so that $(1 - \sigma (\textrm{Cap}_{\varepsilon })) \Big (1 - \frac {1}{2 K_d}\Big )^n < \varepsilon $ , we see that $\mathcal {P} := \mathcal {P}_n$ satisfies all requirements.
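The number of rounds needed by this construction can be estimated explicitly. The sketch below works on $\mathbb{S}^2$ and assumes that caps are defined through the Euclidean distance, so that $\sigma(\textrm{Cap}_r) = r^2/4$ for $0 \leq r \leq 2$ by Archimedes' theorem and therefore $K_2 = 25$. Both this convention and the function names are assumptions made for illustration; the code only evaluates the final inequality of the proof and does not construct the packing itself.

```python
def cap_measure_s2(r):
    """Normalized measure of {y in S^2 : |y - x| <= r}; equals r^2 / 4 for 0 <= r <= 2."""
    return min(r * r / 4.0, 1.0)


def steps_needed(eps, K):
    """Smallest n >= 1 with (1 - sigma(Cap_eps)) * (1 - 1/(2K))**n < eps, for 0 < eps < 1."""
    shrink = 1.0 - 1.0 / (2.0 * K)
    bound = (1.0 - cap_measure_s2(eps)) * shrink
    n = 1
    while bound >= eps:
        bound *= shrink
        n += 1
    return n


if __name__ == "__main__":
    K_2 = 25.0  # sup over r of sigma(Cap_{5r}) / sigma(Cap_r) on S^2, under the assumption above
    for eps in (0.5, 0.1, 0.01):
        print(eps, steps_needed(eps, K_2))
```

Since each round only guarantees a gain of a factor $1 - 1/(2K_d)$ on the uncovered measure, the number of rounds grows like $2K_d \log(1/\varepsilon)$ as $\varepsilon \rightarrow 0$.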
We can now obtain our analogue of Lemma 1, relating the two versions of independence density in the spherical setting.
Lemma 17. For every $\varepsilon> 0$ , $\rho> 0$ , there exists $t_0> 0$ such that the following holds whenever $P_1, \dots , P_n \subset \mathbb {S}^d$ have diameter at most $t_0$ :
Proof. If $A \subset \mathbb {S}^d$ is a set which avoids $P_1, \dots , P_n$ , then for every $x \in \mathbb {S}^d$ , the set $A \cap \textrm{Cap}(x, \rho ) \subseteq \textrm{Cap}(x, \rho )$ also avoids $P_1, \dots , P_n$ . Since $\mathbb {E}_{x \in \mathbb {S}^d}[d_{\textrm{Cap}(x, \rho )}(A)] = \sigma (A)$ , there must exist some $x \in \mathbb {S}^d$ such that
optimizing over A, we conclude that $\mathbf {m}_{\textrm{Cap}(x, \rho )}(P_1,\, \dots ,\, P_n) \geq \mathbf {m}_{\mathbb {S}^d}(P_1,\, \dots ,\, P_n)$ .
For the opposite direction, let $\gamma \leq \varepsilon /4$ be small enough so that $\sigma (\textrm{Cap}_{\rho + \gamma }) \leq (1 + \varepsilon /4)\, \sigma (\textrm{Cap}_{\rho })$ . By Lemma 16, we know there is a cap packing
of $\mathbb {S}^d$ with $\sigma (\mathcal {P}) \geq 1 - \gamma $ and $0 < \rho _1, \dots , \rho _N \leq \gamma $ . Now, let $t_0> 0$ be small enough so that $\sigma (\textrm{Cap}_{\rho _i - 2t_0}) \geq (1 - \varepsilon /4)\, \sigma (\textrm{Cap}_{\rho _i})$ for all $1 \leq i \leq N$ ; note that $t_0$ will ultimately depend only on $\varepsilon $ and $\rho $ .
Fixing any configurations $P_1, \dots , P_n \subset \mathbb {S}^d$ of diameter at most $t_0$ , let $A \subset \textrm{Cap}(x, \rho )$ be a set which avoids all of them. We shall construct a set $\widetilde {A} \subset \mathbb {S}^d$ which also avoids $P_1, \dots , P_n$ , and which satisfies $\sigma (\widetilde {A})> d_{\textrm{Cap}(x, \rho )}(A) - \varepsilon $ ; this will finish the proof.
For each $1 \leq i \leq N$ , denote $\widetilde {\rho }_i := \rho _i - 2t_0 < \gamma $ . We have that
Since $\widetilde {\rho }_i < \gamma $ , dividing by $\sigma (\textrm{Cap}_{\rho })$ , we obtain
There must then exist $y_i \in \textrm{Cap}(x,\, \rho )$ for which $d_{\textrm{Cap}(y_i, \widetilde {\rho }_i)}(A)> d_{\textrm{Cap}(x, \rho )}(A) - \varepsilon /4$ ; fix one such $y_i$ for each $1 \leq i \leq N$ , and let $T_{y_i \rightarrow x_i} \in \textrm{O}(\mathbb {R}^{d+1})$ be any rotation taking $y_i$ to $x_i$ (and thus taking $\textrm{Cap}(y_i,\, \widetilde {\rho }_i)$ to $\textrm{Cap}(x_i,\, \widetilde {\rho }_i)$ ).
We claim that the set
satisfies our requirements. Indeed, we have
Moreover, since $\textrm{diam}\,(P_j) \leq t_0$ and the caps $\textrm{Cap}(x_i,\, \widetilde {\rho }_i)$ are (at least) $2t_0$-distant from each other, we see that any copy of $P_j$ in $\widetilde {A} \subset \bigcup _{i=1}^N \textrm{Cap}(x_i,\, \widetilde {\rho }_i)$ must be entirely contained in one of the caps $\textrm{Cap}(x_i,\, \widetilde {\rho }_i)$. But then it would also be contained (after rotation by $T_{y_i \rightarrow x_i}^{-1}$) in $A \cap \textrm{Cap}(y_i,\, \widetilde {\rho }_i)$; this shows that $\widetilde {A}$ does not contain copies of $P_j$ for any $1 \leq j \leq n$, since A does not, and we are done.
3.5 Results on the spherical independence density
We are finally ready to start a more detailed study of the independence density parameter in the spherical setting.
We start by providing a general lower bound on the independence density of several different configurations in terms of their individual independence densities.
Lemma 18 (Supermultiplicativity).
For all configurations $P_1, \dots , P_n$ on $\mathbb {S}^d$ , we have
Proof. Choose, for each $1 \leq i \leq n$ , a set $A_i \subset \mathbb {S}^d$ which avoids configuration $P_i$ . By taking independent rotations $R_i A_i$ of each set $A_i$ , we see that
There must then exist $R_1, \dots , R_n \in \textrm{O}(\mathbb {R}^{d+1})$ for which
Since $\bigcap _{i=1}^n R_i A_i$ avoids all configurations $P_1, \dots , P_n$ and the sets $A_1, \dots , A_n$ were chosen arbitrarily, the result follows.
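The averaging step in this proof has an exact finite analogue, sketched below for two sets. Here the sphere is replaced by the cyclic group $\mathbb{Z}_N$ and the orthogonal group by its N rotations; this toy model and the function names are assumptions made for illustration and do not appear in the text. Averaging over all rotations gives exactly the product of the densities, so some rotation must do at least as well.

```python
from fractions import Fraction


def rotated(B, s, N):
    """The set B rotated by s steps inside Z_N."""
    return {(b + s) % N for b in B}


def average_intersection_density(A, B, N):
    """Average over all rotations s of the density of A ∩ (B + s) in Z_N."""
    total = sum(len(A & rotated(B, s, N)) for s in range(N))
    return Fraction(total, N * N)


def best_rotation(A, B, N):
    """A rotation s maximizing the density of A ∩ (B + s), together with that density."""
    s = max(range(N), key=lambda shift: len(A & rotated(B, shift, N)))
    return s, Fraction(len(A & rotated(B, s, N)), N)


if __name__ == "__main__":
    N, A, B = 12, {0, 3, 6, 9}, {0, 1, 2, 5}
    product = Fraction(len(A), N) * Fraction(len(B), N)
    print(average_intersection_density(A, B, N) == product)
    print(best_rotation(A, B, N)[1] >= product)
```

The exact identity holds because each pair $(a, b) \in A \times B$ is matched by exactly one rotation, which mirrors the role of the Haar measure in the argument above.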
Using supersaturation, we can show that this lower bound is essentially tight when the configurations considered are all admissible and each one is at a different size scale. Intuitively, this happens because the constraints of avoiding each of these configurations will act at distinct scales and thus not correlate with each other.
Theorem 11 (Asymptotic independence).
For all admissible configurations $P_1, \dots , P_n$ on $\mathbb {S}^d$ and every $0 < \varepsilon \leq 1$, there is a positive increasing function $f: (0, 1] \rightarrow (0, 1]$ such that the following holds: whenever $0 < t_1, \dots , t_n \leq 1$ satisfy $t_{i+1} \leq f(t_i)$ for $1 \leq i < n$, we have
Proof. We have already seen that $\mathbf {m}_{\mathbb {S}^d}(t_1 P_1,\, \dots ,\, t_n P_n) \geq \prod _{i=1}^n \mathbf {m}_{\mathbb {S}^d}(t_i P_i)$ , so it suffices to show that $\mathbf {m}_{\mathbb {S}^d}(t_1 P_1,\, \dots ,\, t_n P_n) \leq \prod _{i=1}^n \mathbf {m}_{\mathbb {S}^d}(t_i P_i) + \varepsilon $ for suitably separated $t_1, \dots , t_n \leq 1$ . We will do so by induction on n, with the base case $n=1$ being trivial (and taking $f \equiv 1$ ).
Suppose then $n \geq 2$ and we have already proven the result for $n-1$ configurations. Let $\tilde {f}: (0, 1] \rightarrow (0, 1]$ be the function promised by the theorem applied to the $n-1$ configurations $P_2, \dots , P_n$ and with accuracy $\varepsilon $ , so that whenever $0 < t_2 \leq 1$ and $0 < t_{j+1} \leq \tilde {f}(t_j)$ for each $2 \leq j < n$ , we have
By the corollary to the Supersaturation Theorem (Corollary 2), for all $0 < t_1 \leq 1$ , there is $\delta _0 = \delta _0(\varepsilon ;\, t_1 P_1)> 0$ such that
Applying Lemma 17 with radius $\rho = \delta _0$ , we see there is $t_0 = t_0(\varepsilon ,\, \delta _0)> 0$ for which
holds whenever $0 < t_2, \dots , t_n \leq t_0/2$ .
Let now $0 < t_1, \dots , t_n \leq 1$ be numbers satisfying
If $A \subset \mathbb {S}^d$ does not contain copies of $t_1 P_1, \dots , t_n P_n$ , then by the preceding discussion, we must have $\sigma \big (\mathcal {Z}_{\delta _0}(\varepsilon )[A]\big ) < \mathbf {m}_{\mathbb {S}^d}(t_1 P_1) + \varepsilon $ and, for all $x \in \mathbb {S}^d$ ,
This means that, inside caps $\textrm{Cap}(x, \delta _0)$ of radius $\delta _0$ , A has density less than $\varepsilon $ (when $x \notin \mathcal {Z}_{\delta _0}(\varepsilon )[A]$ ) except on a set of measure at most $\mathbf {m}_{\mathbb {S}^d}(t_1 P_1) + \varepsilon $ , when it instead has density at most $\prod _{j=2}^n \mathbf {m}_{\mathbb {S}^d}(t_j P_j) + 2\varepsilon $ . Taking averages, we conclude that
It thus suffices to take the function $f: (0, 1] \rightarrow (0, 1]$ given by
to conclude the induction.
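The averaging step can be sketched as follows; this is a hedged illustration rather than the exact computation of the proof. The symbols $d_x(A) := \sigma (A \cap \textrm{Cap}(x, \delta _0)) / \sigma (\textrm{Cap}(x, \delta _0))$ , $Z := \mathcal {Z}_{\delta _0}(\varepsilon )[A]$ , $m_1 := \mathbf {m}_{\mathbb {S}^d}(t_1 P_1)$ and $\Pi := \prod _{j=2}^n \mathbf {m}_{\mathbb {S}^d}(t_j P_j)$ are shorthand introduced here, and we assume that $\sigma $ is normalized so that $\sigma (\mathbb {S}^d) = 1$ . Rotation invariance of $\sigma $ gives $\sigma (A) = \int _{\mathbb {S}^d} d_x(A)\, d\sigma (x)$ , and splitting the integral according to whether $x \in Z$ yields
$$\sigma (A) \leq \sigma (Z)\, (\Pi + 2\varepsilon ) + \big (1 - \sigma (Z)\big )\, \varepsilon \leq (m_1 + \varepsilon )(\Pi + 2\varepsilon ) + \varepsilon \leq m_1 \Pi + 6\varepsilon ,$$
where the last inequality uses $m_1, \Pi , \varepsilon \leq 1$ . The constant $6$ is an artifact of this crude estimate; running the argument with $\varepsilon /6$ in place of $\varepsilon $ recovers an upper bound of the form $\prod _{i=1}^n \mathbf {m}_{\mathbb {S}^d}(t_i P_i) + \varepsilon $ .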
Note that this result provides a partial answer to the analogue of question (Q1) in the spherical setting: if P is admissible, then $\mathbf {m}_{\mathbb {S}^d}(t_1 P,\, t_2 P,\, \dots ,\, t_n P)$ decays exponentially with n as the ratios $t_{j+1}/t_j$ between consecutive scales go to zero (recall from Lemma 10 that $\mathbf {m}_{\mathbb {S}^d}(t P)$ is bounded away from both zero and one for $0 < t \leq 1$ ).
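To make the decay explicit, here is a short sketch in which $c := \inf _{0 < t \leq 1} \mathbf {m}_{\mathbb {S}^d}(t P)$ and $C := \sup _{0 < t \leq 1} \mathbf {m}_{\mathbb {S}^d}(t P)$ are symbols introduced for illustration (Lemma 10 gives $0 < c \leq C < 1$ ). Applying Theorem 11 with $P_1 = \dots = P_n = P$ and combining it with the supermultiplicative lower bound, we get, for scales satisfying $t_{j+1} \leq f(t_j)$ ,
$$c^n \leq \prod _{j=1}^n \mathbf {m}_{\mathbb {S}^d}(t_j P) \leq \mathbf {m}_{\mathbb {S}^d}(t_1 P,\, \dots ,\, t_n P) \leq \prod _{j=1}^n \mathbf {m}_{\mathbb {S}^d}(t_j P) + \varepsilon \leq C^n + \varepsilon .$$
Note that the function f depends on both n and $\varepsilon $ , so the required separation between scales may grow as n increases or $\varepsilon $ decreases.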
By considering an infinite sequence of ‘counterexamples’ as we did in our proof of Bourgain’s Theorem (Theorem 2), we immediately obtain from Theorem 11 the following result.
Corollary 3. Let $P \subset \mathbb {S}^d$ be an admissible configuration. If $A \subseteq \mathbb {S}^d$ has positive measure, then there is some number $t_0> 0$ such that A contains a congruent copy of $t P$ for all $t \leq t_0$ .
This corollary can be seen as the counterpart to Bourgain’s Theorem in the spherical setting, where it is impossible to consider arbitrarily large dilates (the equivalent result of containing all sufficiently small dilates of a configuration in the Euclidean setting also holds with the same proof).
We will next prove that the independence density function $P \mapsto \mathbf {m}_{\mathbb {S}^d}(P)$ is continuous on the set of admissible configurations on $\mathbb {S}^d$ . Before doing so, it is interesting to note that a similar result does not hold for two-point configurations on the unit circle $\mathbb {S}^1$ (which can be seen as the very first instance of nonadmissible configurations). Indeed, it was shown by DeCorte and Pikhurko [Reference DeCorte and Pikhurko7] that $\mathbf {m}_{\mathbb {S}^1}(\{u, v\})$ is discontinuous at a configuration $\{u, v\} \subset \mathbb {S}^1$ whenever the arc length between u and v is a rational multiple of $2\pi $ with odd denominator.
Theorem 12 (Continuity of the independence density).
For any $n \geq 1$ , the function $(P_1, \dots , P_n) \mapsto \mathbf {m}_{\mathbb {S}^d}(P_1, \dots , P_n)$ is continuous on the set of n-tuples of admissible spherical configurations.
Proof. For simplicity of exposition, we will prove the result in the case of only one forbidden configuration, but the general case follows from the same argument.
Fix some $\varepsilon> 0$ and some admissible configuration P on $\mathbb {S}^d$ and let $c(\varepsilon )> 0$ be the constant promised by the Supersaturation Theorem (Theorem 10). By equicontinuity (Lemma 15), there exists $\delta> 0$ such that
Suppose $Q \in \mathcal {B}(P, \delta )$ and $A \subset \mathbb {S}^d$ is a measurable set avoiding Q; we must then have $I_P(A) \leq c(\varepsilon )$ , and so $\sigma (A) \leq \mathbf {m}_{\mathbb {S}^d}(P) + \varepsilon $ . Optimizing over A, we conclude that $\mathbf {m}_{\mathbb {S}^d}(Q) \leq \mathbf {m}_{\mathbb {S}^d}(P) + \varepsilon $ whenever $Q \in \mathcal {B}(P, \delta )$ .
Now, write $P = \{v_1, \dots , v_k\}$ and consider the function $g_P: (\mathbb {S}^d)^k \times \textrm{O}(\mathbb {R}^{d+1}) \rightarrow \mathbb {R}$ given by
Note that this function is continuous and that $\min _{T \in \textrm{O}(\mathbb {R}^{d+1})} g_P(x_1, \dots , x_k, T) = 0$ if and only if $(x_1, \dots , x_k)$ is congruent to $(v_1, \dots , v_k)$ .
By inner regularity, we can find a compact set $A \subset \mathbb {S}^d$ which avoids P and has measure $\sigma (A) \geq \mathbf {m}_{\mathbb {S}^d}(P) - \varepsilon $ . The continuous function $g_P$ attains a minimum on the compact set $A^k \times \textrm{O}(\mathbb {R}^{d+1})$ ; denote this minimum by $\gamma $ and note that $\gamma> 0$ since A avoids P. Let us show that A also avoids Q, for all $Q \in \mathcal {B}(P, \gamma /2k)$ . Indeed, writing $Q = \{u_1, \dots , u_k\}$ (with the labels chosen so as to minimize their distance to the corresponding points of P), for any points $x_1, \dots , x_k \in A$ and any $T \in \textrm{O}(\mathbb {R}^{d+1})$ , we have that
which is at least $\gamma /2> 0$ if $\|Q - P\|_{\infty } \leq \gamma /2k$ . For such configurations, we then obtain
We conclude that $|\mathbf {m}_{\mathbb {S}^d}(Q) - \mathbf {m}_{\mathbb {S}^d}(P)| \leq \varepsilon $ whenever $\|Q - P\|_{\infty } \leq \min \{\delta , \gamma /2k\}$ , finishing the proof.
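The second half of this argument can be sketched concretely under one natural choice of $g_P$ . This choice is an assumption made here for illustration, picked because it matches the constant $\gamma /2k$ , and it may differ from the definition intended above: take $g_P(x_1, \dots , x_k, T) := \sum _{i=1}^k \|x_i - T v_i\|$ and $\|Q - P\|_{\infty } := \max _{1 \leq i \leq k} \|u_i - v_i\|$ . Since T is an isometry, the triangle inequality gives, for all $x_1, \dots , x_k \in A$ and all $T \in \textrm{O}(\mathbb {R}^{d+1})$ ,
$$g_Q(x_1, \dots , x_k, T) \geq \sum _{i=1}^k \|x_i - T v_i\| - \sum _{i=1}^k \|T v_i - T u_i\| \geq \gamma - k\, \|Q - P\|_{\infty },$$
which is at least $\gamma /2$ when $\|Q - P\|_{\infty } \leq \gamma /2k$ . In that case A avoids Q, and so $\mathbf {m}_{\mathbb {S}^d}(Q) \geq \sigma (A) \geq \mathbf {m}_{\mathbb {S}^d}(P) - \varepsilon $ .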
As our definition of the independence density $\mathbf {m}_{\mathbb {S}^d}(P)$ involved a supremum over all P-avoiding measurable sets $A \subseteq \mathbb {S}^d$ , it is not immediately clear whether there actually exists a measurable P-avoiding set attaining this extremal value of density. In fact, such a result is false in the case where $d = 1$ and we are considering two-point configurations $\{u, v\} \subset \mathbb {S}^1$ : if the length of the arc between u and v is not a rational multiple of $\pi $ , it was shown by Székely [Reference Székely23] that $\mathbf {m}_{\mathbb {S}^1}(\{u, v\}) = 1/2$ , but there is no $\{u, v\}$ -avoiding measurable set of density $1/2$ .
We will now show that extremizer sets exist whenever the configuration we are forbidding is admissible. Note that the result also holds (with essentially unchanged proof) when forbidding several admissible configurations; this generalizes to higher-order configurations a theorem of DeCorte and Pikhurko [Reference DeCorte and Pikhurko7] for forbidden distances on the sphere.
Theorem 13 (Existence of extremizers).
If $P \subset \mathbb {S}^{d}$ is an admissible configuration, then there exists a P-avoiding measurable set $A \subseteq \mathbb {S}^d$ attaining $\sigma (A) = \mathbf {m}_{\mathbb {S}^d}(P)$ .
Proof. Let $A_1, A_2, \dots \subseteq \mathbb {S}^d$ be a sequence of P-avoiding measurable sets satisfying $\lim _{i \rightarrow \infty } \sigma (A_i) = \mathbf {m}_{\mathbb {S}^d}(P)$ . By passing to a subsequence if necessary, we may assume that $(A_i)_{i \geq 1}$ converges to some function $A \in \mathcal {B}_{\infty }$ in the weak $^*$ topology of $L^{\infty }(\mathbb {S}^d)$ . We shall prove two things:
(i) the limit function A is $\{0,\, 1\}$ -valued almost everywhere, so we can identify it with its support $\text {supp} \,A$ ;
(ii) after possibly modifying it on a zero-measure set, this set A will avoid P.
With these two results we will be done, since $\sigma (A) = \lim _{i \rightarrow \infty } \sigma (A_i) = \mathbf {m}_{\mathbb {S}^d}(P)$ .
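By weak $^*$ convergence, we know that $0 \leq A \leq 1$ almost everywhere, and by weak $^*$ continuity (Lemma 14), we also have $I_P(A) = \lim _{i \rightarrow \infty } I_P(A_i) = 0$ . From this, we easily conclude that $I_P(\text {supp} \,A) = 0$ , and also
$$\sigma (\text {supp} \,A) \geq \int _{\mathbb {S}^d} A(x)\, d\sigma (x) = \lim _{i \rightarrow \infty } \sigma (A_i) = \mathbf {m}_{\mathbb {S}^d}(P). \tag{16}$$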
But Lemma 11 implies that $\sigma (\text {supp} \,A) \leq \mathbf {m}_{\mathbb {S}^d}(P)$ , which by (16) and the fact that $0 \leq A \leq 1$ , can only happen if $A = \text {supp} \,A$ almost everywhere. This proves $(i)$ .
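Spelled out, and writing $\mathbf {1}_{\text {supp} \,A}$ for the indicator function of the support (notation introduced here for illustration), the last deduction runs as follows. Testing weak $^*$ convergence against the constant function $1$ gives $\int _{\mathbb {S}^d} A\, d\sigma = \lim _{i \rightarrow \infty } \sigma (A_i) = \mathbf {m}_{\mathbb {S}^d}(P)$ , while $0 \leq A \leq 1$ gives $A \leq \mathbf {1}_{\text {supp} \,A}$ almost everywhere. Therefore
$$0 \leq \int _{\mathbb {S}^d} \big (\mathbf {1}_{\text {supp} \,A} - A\big )\, d\sigma = \sigma (\text {supp} \,A) - \mathbf {m}_{\mathbb {S}^d}(P) \leq 0,$$
and a nonnegative function with zero integral vanishes almost everywhere, so $A = \mathbf {1}_{\text {supp} \,A}$ almost everywhere.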
Identifying A with its support and using that $I_P(A) = 0$ , Lemma 11 implies we can remove a zero-measure subset of A in order to remove all copies of P. This proves item $(ii)$ and finishes the proof of the theorem.
To conclude, let us make explicit what we can say about the possible independence densities when forbidding n distinct contractions of an admissible configuration P; due to lack of dilation invariance in the spherical setting, characterizing these values in terms of simpler quantities is much harder than it is in the Euclidean setting.
Denote $\mathcal {M}_n^{\mathbb {S}^d}(P) := \big \{\mathbf {m}_{\mathbb {S}^d}(t_1 P,\, t_2 P,\, \dots ,\, t_n P):\, 0 < t_1 < t_2 < \dots < t_n \leq 1\big \}$ . Due to continuity of $\mathbf {m}_{\mathbb {S}^d}$ (Theorem 12), this set is an interval and its upper extremity is $\sup _{0 < t \leq 1} \mathbf {m}_{\mathbb {S}^d}(t P)$ . By supermultiplicativity (Lemma 18), the lower extremity of $\mathcal {M}_n^{\mathbb {S}^d}(P)$ is at least $\inf _{0 < t \leq 1} \mathbf {m}_{\mathbb {S}^d}(t P)^n$ , and by asymptotic independence (Theorem 11), it can be at most $\inf _{0 < t \leq 1} \mathbf {m}_{\mathbb {S}^d}(t P) \cdot \liminf _{t \rightarrow 0} \mathbf {m}_{\mathbb {S}^d}(t P)^{n-1}$ .
4 Concluding remarks and open problems
Our results leave open the question of what happens when the configurations we forbid are not admissible. There are two different reasons for a given configuration (either on the space or on the sphere) to not be admissible, so let us examine them separately.
The first reason is that P is degenerate, meaning that its points are affinely dependent if we are on $\mathbb {R}^d$ or linearly dependent if we are on $\mathbb {S}^d$ . In the Euclidean setting, Bourgain [Reference Bourgain2] gave an example of sets $A_d \subset \mathbb {R}^d$ (for each $d \geq 2$ ) which have positive density but which avoid arbitrarily large dilates of a degenerate three-point configuration of the form $\{-v, 0, v\}$ . These sets then show that the conclusion of Bourgain’s Theorem (and thus also the conclusion of our Theorem 4) is false for this degenerate configuration.
This counterexample was later generalized by Graham [Reference Graham13], who showed that a result like Bourgain’s Theorem can only hold if P is contained on the surface of some sphere of finite radius (as is always the case when P is nondegenerate). In fact, Graham’s result implies (for instance) that
whenever $P \subset \mathbb {R}^d$ is nonspherical, that is, not contained on the surface of any sphere. Some kind of nondegeneracy hypothesis is thus necessary both for Bourgain’s result and for our Theorem 4.Footnote 11
It is interesting to note, however, that more recent results of Ziegler [Reference Ziegler24, Reference Ziegler25] (generalizing a theorem of Furstenberg, Katznelson and Weiss [Reference Furstenberg, Katznelson and Weiss12] for three-point configurations) show that every set $A \subseteq \mathbb {R}^d$ of positive upper density is arbitrarily close to containing all large enough dilates of any finite configuration $P \subset \mathbb {R}^d$ . More precisely, denoting by $A_{\delta }$ the set of all points at distance at most $\delta $ from the set A, Ziegler proved the following.
Theorem 14. Let $A \subseteq \mathbb {R}^d$ be a set of positive upper density and $P \subset \mathbb {R}^d$ be a finite set. Then there exists $t_0> 0$ such that, for any $t \geq t_0$ and any $\delta> 0$ , the set $A_{\delta }$ contains a configuration congruent to $t P$ .
The proof of this theorem is ergodic theoretic in nature, making essential use of deep and difficult results regarding nilflows and the characteristic factors of nonconventional ergodic averages. It unfortunately does not seem to follow from our methods.
Let us now turn to the second reason for a configuration P on $\mathbb {R}^d$ or $\mathbb {S}^d$ to be nonadmissible, namely that it contains $d+1$ points (if it has more than $d+1$ points, then it is obviously degenerate). In this case, we cannot apply the same strategy we used to prove the Counting Lemmas, and it is not clear whether they or the analogues of Bourgain’s Theorem are true. We conjecture that they are whenever $d \geq 2$ , so that we can remove the cardinality condition from the statement of Bourgain’s result and of our ‘asymptotic independence’ Theorem 4 and Theorem 11.
In particular, let us make more explicit the simplest case of this conjecture, which is an obvious question left open since the results of Bourgain and of Furstenberg, Katznelson and Weiss:
Conjecture 1. Let $A \subset \mathbb {R}^2$ be a set of positive upper density and let $u, v, w \in \mathbb {R}^2$ be noncollinear points. Then, there exists $t_0> 0$ such that for any $t \geq t_0$ , the set A contains a configuration congruent to $\{t u, t v, t w\}$ .
Another question we ask is related to a suspected compatibility condition between the Euclidean and spherical settings. Since $\mathbb {S}^d$ resembles $\mathbb {R}^d$ at small scales, it seems geometrically intuitive that $\mathbf {m}_{\mathbb {S}^d}(t P)$ should get increasingly close to $\mathbf {m}_{\mathbb {R}^d}(P)$ as $t \rightarrow 0$ whenever P is a contractible configuration on $\mathbb {S}^d$ (it is easy to show that a configuration $P \subset \mathbb {S}^d$ is contractible if and only if it is contained in a d-dimensional affine subspace, so we can embed it in $\mathbb {R}^d$ ). We ask whether this intuition is indeed correct (i.e., is it true that $\lim _{t \rightarrow 0} \mathbf {m}_{\mathbb {S}^d}(t P) = \mathbf {m}_{\mathbb {R}^d}(P)$ for all contractible configurations $P \subset \mathbb {S}^d$ ?).
From a more combinatorial perspective, we wish to know whether an analogue of the Hypergraph Removal Lemma holds for forbidden geometrical configurations. In intuitive terms, the question we ask is whether a measurable set A (either on $\mathbb {R}^d$ or on $\mathbb {S}^d$ ) which contains ‘few’ copies of some given configuration P can be made P-avoiding by removing only ‘a few’ of its points.Footnote 12 Such a result would characterize geometrical sets having few copies of P as those which are close to a set avoiding this configuration, and it would trivially imply the corresponding Supersaturation Theorem; note that this is a quantitative and stronger version of our zero-measure removal Lemmas 2 and 11.
Finally, it would be very interesting to have a way of obtaining good upper bounds for the independence densities of a given configuration or family of configurations. There are several papers (see [Reference Bachoc, Passuello and Thiery1, Reference DeCorte, de Oliveira Filho and Vallentin6] and the references therein) which consider this question in the case of a single two-point configuration, drawing on powerful methods from the theory of conic optimization and representation theory, and it is already quite challenging in this simplest case. Oliveira and Vallentin also considered the case of several forbidden two-point configurations in Euclidean space [Reference de Oliveira Filho and Vallentin16] and in arbitrary compact, connected, rank-one symmetric spaces [Reference de Oliveira Filho and Vallentin17]; they use linear and semidefinite programming methods to prove that the independence density of n distinct two-point configurations decays exponentially with n if their sizes are sufficiently far apart.Footnote 13
We believe that the study of the independence density for higher-order configurations in the optimization setting is also worthwhile: these configurations serve as model problems for symmetric optimization problems depending on higher-order relations, and their study might prove very fruitful for the development of new methods.
Acknowledgements
This work was carried out while the author was a Ph.D. student at the University of Cologne. The author would like to thank Fernando de Oliveira Filho, Lucas Slot and Frank Vallentin for many helpful discussions. We also thank the anonymous reviewer, Fernando de Oliveira Filho and Frank Vallentin for several suggestions which improved the presentation of this paper.
Competing interest
The author has no competing interest to declare.
Financial support
This work was supported by the European Union’s EU Framework Programme for Research and Innovation Horizon 2020 under the Marie Skłodowska-Curie Actions Grant Agreement No 764759 (MINOA), and by the Dutch Research Council (NWO) as part of the NETWORKS programme (grant no. 024.002.003).