1. Introduction
Dependence in multivariate linear factor models is determined by a collection of independent random variables, called factors, which are shared by the modelled variables. In extreme value analysis there are the max-linear and the additive factor models with heavy-tailed factors. In [Reference Einmahl, Krajina and Segers10], it is shown that both have the same max-domain of attraction.
In [Reference Gissibl and Klüppelberg13], a link is made between such factor models and probabilistic graphical models via a max-linear recursively defined structural equation model on a directed acyclic graph (DAG). Each node carries a variable defined as a weighted maximum of its parent variables and an independent factor. This leads to a representation of the graphical model as a (max-)factor model as in [Reference Einmahl, Krajina and Segers10], the factors relevant for a given variable being limited to the set of its ancestors. More recent is the linear causally structured model in [Reference Gnecco, Meinshausen, Peters and Engelke16]: each variable is the weighted sum of the variables on all its parent nodes plus an independent factor. This leads to a representation where a single variable is a weighted sum of all its ancestral factors.
In this paper, we study a type of graph that, to the best of our knowledge, is not yet known, and to which we have given a name that reflects its most important properties: a tree of transitive tournaments (TTT), denoted by $\mathcal{T}$ . A tournament is a graph obtained by directing a complete graph, while a tournament is said to be transitive if it has no directed cycles. The name reflects the interpretation of such a graph as a competition where every node is a player and a directed edge points from the winner to the loser. Some examples are hierarchical relations between members of animal and bird societies, brand preferences, and votes between two alternative policies [Reference Harary and Moser18]. A TTT links up several such transitive tournaments in a tree-like structure. It is acyclic by construction. If there is a directed path from one node to another one, there is a unique shortest such path. Moreover, between any pair of nodes, there is a unique shortest undirected path.
In this paper, we study max-linear graphical models with respect to a TTT as defined in (2) below. In particular, for a max-linear random vector $X=(X_v, v\in V)$ with node set V, we study the limit in distribution
\begin{equation} \mathcal{L}(X_v/X_u, v\in V\mid X_u>t)\stackrel{d}{\longrightarrow} \mathcal{L}(A_{uv}, v\in V), \qquad t\to\infty. \tag{1} \end{equation}
It is not hard to show that the limit distribution in (1) is discrete [Reference Segers31, Example 1]. We show that if the TTT has a unique node without parents, a so-called source node, the joint distribution of $(A_{uv}, v\in V)$ is determined by products of independent multiplicative increments along the unique shortest undirected paths between the node u at which the high threshold is exceeded on the one hand and the rest of the nodes on the other hand. Such behaviour is analogous to that of Markov random fields on block graphs in [Reference Asenova and Segers5] and of Markov trees in [Reference Segers31, Theorem 1]. In turn, these results go back to the extensive literature on the additive or multiplicative structure of extremes for Markov chains [Reference Janssen and Segers20, Reference Resnick and Zeber28, Reference Segers29, Reference Smith32, Reference Yun36].
An underlying reason for the factorization into independent increments is the fact that a max-linear graphical model with respect to a TTT is a Markov random field with respect to the undirected graph associated to the original, directed graph when the TTT has a unique source. A TTT with unique source has no v-structures, that is, no nodes with non-adjacent parents. Both properties, the factorization of the limiting variables and the Markovianity with respect to the undirected graph, are lost if the graph contains v-structures. To show this, we rely on the recent theory of conditional independence in max-linear Bayesian networks based on the notion of $*$ -connectedness [Reference Améndola, Hollering, Sullivant, Tran, de Campos, Maathuis and Research Press1, Reference Améndola, Klüppelberg, Lauritzen and Tran2]. This theory diverges from classical results on conditional independence in Bayesian networks based on the notion of d-separation [Reference Koller and Friedman23, Reference Lauritzen25].
In our paper the graph is given. A significant line of research in the context of extremal dependence is graph discovery. Given observations on a number of variables represented as nodes in a graph, the task is to estimate the edges. For Bayesian networks we can also talk about causality discovery, because directed edges show the direction of influence. A first attempt to identify the DAG in the context of max-linear models was [Reference Gissibl, Klüppelberg and Otto15], which was followed by several papers focusing on this topic: [Reference Buck and Klüppelberg8], [Reference Gissibl, Klüppelberg and Lauritzen14], [Reference Klüppelberg and Krali21], [Reference Tran, Buck and Klüppelberg33], and [Reference Tran, Buck and Klüppelberg34]. The problems related to identifiability of the true graph and to the estimation of the edge weights are discussed in [Reference Klüppelberg and Lauritzen22]. The paper [Reference Gnecco, Meinshausen, Peters and Engelke16] studies a new metric called the causal tail coefficient, which is shown to reveal the structure of a linear causal recursive model with heavy-tailed noise. Graph discovery for non-directed graphs is studied in [Reference Engelke and Hitz11], [Reference Engelke and Volgushev12], and [Reference Hu, Peng and Segers19].
Inspired by practice, and more specifically by river network applications [Reference Asenova, Mazo and Segers4], we study a different identifiability problem. If the structure of the graph is known, it may happen that on some nodes the variables are latent, i.e., unobserved. The identifiability problem in this case is whether two different parameter vectors can still generate the same distribution of the observable part of the model. If this is possible then we cannot uniquely identify all tail dependence parameters that characterize the full distribution. Similarly to [Reference Asenova and Segers5], the identifiability criterion involves properties of the nodes with latent variables. The criterion is specific for a TTT with unique source and is easy to check. Our identifiability problem resembles the ‘method of path coefficients’ of Sewall Wright, which uses a system of equations involving correlations to solve for the edge coefficients [Reference Wright35].
The novelty of the paper lies in several directions. First, we introduce a new class of graphs, called trees of transitive tournaments (TTT), which are the directed acyclic analogue of block graphs. TTTs can be seen as a generalization of directed trees, where edges are replaced by transitive tournaments. Second, we show that a max-linear graphical model over a TTT with unique source exhibits properties known for other graphical models, namely Markov trees [Reference Segers31] and Markov block graphs [Reference Asenova and Segers5]. In particular, when the TTT has a unique source, the model is Markov with respect to the skeleton of the graph. This property underlies the factorization of the tail limit into independent increments along the unique shortest trails. Finally, we study a problem of identifiability of the edge weights from the angular measure, both when all variables are observed and when some of them are latent.
The structure of the paper is as follows. In Section 2 we introduce the TTT, the max-linear model, and its angular measure, which plays a key role in almost all proofs. In Section 3 we discuss the limiting distribution of (1) and give four equivalent characterizations of a max-linear graphical model with respect to a TTT with unique source. The identifiability problem is covered in Section 4. The discussion summarizes the main points of the paper. The appendices contain some additional lemmas and the proofs that are not presented in the main text.
2. Notions and definitions
2.1. Directed graphs
Let $\mathcal{T}=(V,E)$ be a directed acyclic graph (DAG) with finite vertex (node) set V and edge set $E\subset V\times V$ . An edge $e\;:\!=\;(u,v)\in E$ is directed, meaning $(u,v)\neq (v,u)$ ; it is outgoing with respect to the parent node u and incoming with respect to the child node v. The graph $\mathcal{T}$ excludes loops, i.e., edges of the form (u, u), and as $\mathcal{T}$ is directed, we cannot have both $(u,v)\in E$ and $(v,u)\in E$ . Two nodes u and v are adjacent if (u, v) or (v, u) is an edge. A cycle is a sequence of edges $e_1,\ldots,e_n$ with $e_k = (u_k, u_{k+1})$ and $u_1 = u_{n+1}$ for some nodes $u_1,\ldots,u_n$ . The property that $\mathcal{T}$ is acyclic means that it does not contain any cycles. The graph $\mathcal{T}$ is assumed connected; i.e., for any two distinct nodes u and v, we can find nodes $u_1=u,u_2,\ldots,u_{n+1}=v$ such that $u_k$ and $u_{k+1}$ are adjacent for every $k = 1,\ldots,n$ ; we call the associated edge sequence an undirected path or a trail between u and v. If all edges are directed in the same sense, i.e., $(u_k, u_{k+1}) \in E$ for all $k = 1,\ldots,n$ , we refer to the edge sequence as a (directed) path from the ancestor u to the descendant v. Recall that a path is directed by convention, so when we need non-directed paths this will be indicated explicitly. Between a pair of nodes there may be several paths. The set of all paths between two nodes $u,v\in V$ is denoted by $\pi(u,v)$ . An element, say p, of $\pi(u,v)$ is a collection of edges $\{(v_1, v_2), (v_2,v_3),\ldots, (v_{n-1},v_n)\}$ for a path that involves the non-repeating nodes $\{v_1=u, v_2, \ldots, v_{n-1}, v_{n}=v\}$ . Note that $\pi(u, u) = \varnothing$ in an acyclic graph.
A source is a node without parents. If a DAG has a unique source, this node is an ancestor of every other node. This property follows from the following reasoning: let $u_0$ denote the unique source node of the DAG, and let v be any other node different from $u_0$ . Then v must have a parent, say u. If $u = u_0$ , we are done. Otherwise, replace v by u and restart. Since the graph is finite and has no cycles, this chain must stop at some moment at a node without parents. But this node is necessarily equal to $u_0$ by assumption.
A graph, directed or not, is complete if there is an edge between any pair of distinct nodes. A subgraph of a graph is biconnected if the removal of any of its nodes will not disconnect the subgraph. A maximal biconnected subgraph, also known as a biconnected component, is a subgraph that cannot be extended by adding one adjacent node without violating this principle.
A directed complete graph is called a tournament. A tournament $\tau=(V_\tau, E_\tau)$ is transitive if $(u, v), (v, w) \in E_\tau$ implies $(u, w) \in E_\tau$ . A transitive tournament is necessarily acyclic. The graph-theoretic properties of transitive tournaments are studied in [Reference Harary and Moser18]. The property most used here is that the set of out-degrees of the d nodes of a transitive tournament is $\{d-1, d-2, \ldots, 0\}$ ; the in- and out-degrees of a node are the numbers of incoming and outgoing edges, respectively.
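The two defining checks and the out-degree property can be illustrated with a minimal Python sketch. The function names and the two four-node examples are assumptions made for illustration only; they do not come from [Reference Harary and Moser18].

```python
from itertools import combinations

def is_transitive_tournament(nodes, edges):
    """Check that (nodes, edges) is a tournament and that it is transitive."""
    edges = set(edges)
    # Tournament: exactly one of (u, v) and (v, u) for every pair of distinct nodes.
    for u, v in combinations(nodes, 2):
        if ((u, v) in edges) == ((v, u) in edges):
            return False
    # Transitivity: (u, v) and (v, w) in E imply (u, w) in E.
    return all((u, w) in edges
               for (u, v) in edges for (x, w) in edges if x == v)

def out_degrees(nodes, edges):
    """Out-degrees of all nodes, sorted in decreasing order."""
    return sorted((sum(1 for (a, _) in edges if a == u) for u in nodes),
                  reverse=True)

# Hypothetical examples on four nodes.
nodes = [1, 2, 3, 4]
transitive = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
cyclic = [(1, 2), (2, 3), (3, 1), (1, 4), (2, 4), (3, 4)]  # contains 1 -> 2 -> 3 -> 1

print(is_transitive_tournament(nodes, transitive))
print(out_degrees(nodes, transitive))
print(is_transitive_tournament(nodes, cyclic))
```

In the transitive example the sorted out-degrees are d-1, d-2, ..., 0 with d = 4, so the node with out-degree d-1 is the source of the tournament.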
A subgraph of a graph is a maximal transitive tournament if it is not properly contained in another subgraph which is also a transitive tournament. The set of maximal transitive tournaments that are subgraphs of a DAG $\mathcal{T}$ will be denoted by $\mathbb{T}$ . For brevity we will just write ‘tournament’ when we mean a maximal transitive tournament and denote it by $\tau$ .
2.2. Tree of transitive tournaments
A block graph is an undirected graph in which every maximal biconnected subgraph is a complete graph [Reference Le and Tuy26]. Let T denote the non-directed version of $\mathcal{T}$ , also called the skeleton of $\mathcal{T}$ . It shares the same node set as $\mathcal{T}$ , and for every edge (u, v) in the original graph $\mathcal{T}$ , the reverse edge (v, u) is added to form the edge set of the skeleton graph T, after which each pair of edges $\{(u, v), (v, u)\}$ is identified with the undirected edge $\{u, v\}$ of T.
Definition 2.1. (Tree of transitive tournaments (TTT).) A tree of transitive tournaments is a connected DAG whose skeleton is a block graph.
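Definition 2.1 can be checked mechanically on small graphs. The sketch below is only an illustration under stated assumptions: the helper name `is_ttt` is hypothetical, acyclicity is tested with Kahn's algorithm, and the biconnected components of the skeleton are found with a standard depth-first search; none of these implementation choices is taken from the paper.

```python
def is_ttt(nodes, edges):
    """Check Definition 2.1: a connected DAG whose skeleton is a block graph."""
    nodes = list(nodes)
    adj = {v: set() for v in nodes}        # skeleton (undirected adjacency)
    children = {v: set() for v in nodes}   # directed adjacency
    indeg = {v: 0 for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
        children[u].add(v)
        indeg[v] += 1

    # Acyclicity: Kahn's algorithm must be able to remove every node.
    queue = [v for v in nodes if indeg[v] == 0]
    removed = 0
    while queue:
        u = queue.pop()
        removed += 1
        for w in children[u]:
            indeg[w] -= 1
            if indeg[w] == 0:
                queue.append(w)
    if removed != len(nodes):
        return False

    # Biconnected components of the skeleton via DFS with an edge stack.
    index, low, stack, blocks = {}, {}, [], []

    def dfs(u, parent):
        index[u] = low[u] = len(index)
        for w in adj[u]:
            if w not in index:
                stack.append((u, w))
                dfs(w, u)
                low[u] = min(low[u], low[w])
                if low[w] >= index[u]:     # u separates the subtree below w
                    block = set()
                    while True:
                        a, b = stack.pop()
                        block |= {a, b}
                        if (a, b) == (u, w):
                            break
                    blocks.append(block)
            elif w != parent and index[w] < index[u]:
                stack.append((u, w))       # back edge
                low[u] = min(low[u], index[w])

    dfs(nodes[0], None)
    if len(index) != len(nodes):
        return False                       # skeleton is not connected

    # Block graph: every biconnected component is complete.
    return all(b in adj[a] for block in blocks
               for a in block for b in block if a != b)

# Hypothetical examples: a triangle with a pendant edge, and a "diamond".
print(is_ttt([1, 2, 3, 4], [(1, 2), (1, 3), (2, 3), (3, 4)]))
print(is_ttt([1, 2, 3, 4], [(1, 2), (1, 3), (2, 4), (3, 4)]))
```

The diamond fails because its skeleton is a four-cycle, which is biconnected but not complete.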
A TTT enjoys three key properties. They all follow from the link with block graphs, whose characteristics can be found in [Reference Le and Tuy26].
Lemma 2.1. (Properties I.) For a TTT, the following properties hold:
-
(P1) Two or more maximal transitive tournaments can have at most one common node, referred to as a separator node.
-
(P2) There is no undirected cycle that passes through nodes in different maximal transitive tournaments.
-
(P3) Between every pair of nodes there is a unique shortest trail (undirected path).
Proof. All properties are direct consequences of the fact that removing directions from the TTT we obtain a block graph. In a block graph the minimal separator sets are singletons [Reference Harary17, Theorem B]; there is a unique shortest path between two nodes [Reference Behtoei, Jannesari and Taeri6, Theorem 1.a)]; and the graph is acyclic up to blocks, a property that follows from the first one.
Similarly to a block graph [Reference Le and Tuy26], a TTT can be seen as a tree whose edges are replaced by transitive tournaments.
In a TTT, if there is at least one (directed) path between distinct nodes u and v, there is a unique shortest path (see Lemma 2.2-1) between them, which we denote by p(u, v), and which belongs to $\pi(u, v)$ . We also set $p(u,u)=\varnothing$ by convention.
A key object in the paper is a TTT with unique source. In Lemma 2.2-2 below it is shown that in this case there are no nodes with parents that are not adjacent or ‘married’, a configuration also known as a v-structure [Reference Koller and Friedman23].
Consider the TTT in Figure 1, which presents some of the notions introduced above. Each tournament is acyclic, and we cannot find a cycle passing through different tournaments either. This is why we call such a graph a tree of transitive tournaments. There are three v-structures: one on nodes 1, 3, 4, one on 3, 7, 8, and one on 5, 7, 8. The main results in this paper require a TTT without v-structures. According to Lemma 2.2, there are no v-structures in a TTT with unique source. This is illustrated in Figure 2.
Considered on its own, every tournament in a TTT has a unique source; this follows from the ordering of the out-degrees due to [Reference Harary and Moser18] mentioned earlier. When we talk about a source node, we will always state whether this is with respect to the whole graph or to a particular tournament.
In a general directed graph (V, E), let $\mathrm{pa}(v)\subset V$ denote the set of parents of $v\in V$ , and put $\mathrm{Pa}(v)=\mathrm{pa}(v)\cup \{v\}$ . In a similar way, let $\mathrm{an}(v)$ , $\mathrm{desc}(v)$ , and $\mathrm{ch}(v)$ denote the sets of ancestors, descendants, and children, respectively, excluding v, while $\mathrm{An}(v)$ , $\mathrm{Desc}(v)$ , and $\mathrm{Ch}(v)$ denote the same sets but including v.
Below we present some additional properties used often in the paper.
Lemma 2.2. (Properties II.) Let $\mathcal{T}$ be a TTT as in Definition 2.1. We have the following statements:
-
1. If there is a path between two nodes, then there is a unique shortest path between them.
-
2. The TTT $\mathcal{T}$ has a unique source if and only if it possesses no v-structures.
-
3. If $\mathcal{T}$ has a unique source, then for any two nodes $i\neq j$ , either the sets $\mathrm{Desc}(i)$ and $\mathrm{Desc}(j)$ are disjoint, or one contains the other (that is, i is an ancestor of j or vice versa).
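Statements 2 and 3 of Lemma 2.2 can be illustrated on two small graphs. The sketch below uses hypothetical helper names and two hypothetical five-node TTTs (a triangle on nodes 1, 2, 3 joined to nodes 4 and 5 through node 3); it checks the statements on these examples only and is not a proof.

```python
def parents(nodes, edges):
    pa = {v: set() for v in nodes}
    for u, v in edges:
        pa[v].add(u)
    return pa

def sources(nodes, edges):
    """Nodes without parents."""
    pa = parents(nodes, edges)
    return [v for v in nodes if not pa[v]]

def v_structures(nodes, edges):
    """Triples (a, b, v) where a < b are non-adjacent parents of v."""
    pa, E = parents(nodes, edges), set(edges)
    return sorted((a, b, v) for v in nodes for a in pa[v] for b in pa[v]
                  if a < b and (a, b) not in E and (b, a) not in E)

def desc_inclusive(v, edges):
    """Desc(v): descendants of v, including v itself."""
    out, todo = {v}, [v]
    while todo:
        u = todo.pop()
        for a, b in edges:
            if a == u and b not in out:
                out.add(b)
                todo.append(b)
    return out

def nested_or_disjoint(nodes, edges):
    """True if any two sets Desc(i), Desc(j) are disjoint or nested."""
    D = {v: desc_inclusive(v, edges) for v in nodes}
    return all(D[i].isdisjoint(D[j]) or D[i] <= D[j] or D[j] <= D[i]
               for i in nodes for j in nodes if i != j)

nodes = [1, 2, 3, 4, 5]
one_source = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5)]   # hypothetical TTT, source 1
two_sources = [(1, 2), (1, 3), (2, 3), (4, 3), (3, 5)]  # hypothetical TTT, sources 1 and 4

for edges in (one_source, two_sources):
    print(sources(nodes, edges), v_structures(nodes, edges),
          nested_or_disjoint(nodes, edges))
```

In the second graph, node 3 has the non-adjacent parents 1 and 4 (and 2 and 4), and the sets Desc(1) and Desc(4) overlap without being nested.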
Lemma 2.3. (Properties III.) Consider a TTT $\mathcal{T}=(V,E)$ as in Definition 2.1 with unique source.
-
1. If $\{v_1, v_2, \ldots, v_n\}$ is the node sequence of a unique shortest path between nodes $v_1$ and $v_n$ , then every node except for possibly $v_1$ and $v_n$ is the source node of the tournament shared with the next node in the sequence.
-
2. For any two distinct nodes u,v in $\mathcal{T}$ , either the unique shortest trail between them is p(u,v) or p(v,u), or there exists a node $w \in V \setminus \{u, v\}$ such that the trail is composed of the two shortest paths p(w,u) and p(w,v).
2.3. Max-linear structural equation model on a TTT
Consider a directed graph, (V, E). To each edge $e = (i, j) \in E$ we associate a weight $c_e = c_{ij} \in [0, \infty)$ . The product of the edge parameters over a directed path $p = \{e_1,\ldots,e_m\}$ is denoted by
\[ c_p = \prod_{k=1}^{m} c_{e_k}. \]
When the product is over the unique shortest path from u to v, we write $c_{p(u,v)}$ . The product over the empty set being one by convention, we have $c_{p(i,i)} = 1$ .
Let $(Z_i, i \in V)$ be a vector of independent unit-Fréchet random variables, i.e., $\operatorname{\mathbb{P}}(Z_i \le z) = \exp\!(\!-\!1/z)$ for $z > 0$ . In the spirit of [Reference Gissibl and Klüppelberg13], a recursive max-linear model on a TTT $\mathcal{T}$ is defined by
\begin{equation} X_v = \bigvee_{i\in \mathrm{pa}(v)} c_{iv} X_i \vee c_{vv} Z_v, \qquad v\in V, \tag{2} \end{equation}
where the parameters $c_e$ , for $e\in E$ , and $c_{vv}$ , for $v\in V$ , are positive. We interpret this constraint as follows: if $c_{ij} = 0$ , the variable $X_j$ cannot be influenced by $X_i$ through the edge (i, j), and the edge could be removed from the graph. If $c_{vv} = 0$ , the factor variable $Z_v$ does not influence $X_v$ . We do not want to deal with such border cases, so we assume that all parameters in the model definition (2) are positive. According to [Reference Gissibl and Klüppelberg13, Theorem 2.2], the expression in (2) is equal also to
\begin{equation} X_v = \bigvee_{i\in V} b_{vi} Z_i, \qquad v\in V, \tag{3} \end{equation}
with
\begin{equation} b_{vi} = \begin{cases} c_{vv} & \text{if } i = v, \\ c_{ii} \max_{p\in\pi(i,v)} c_p & \text{if } i \in \mathrm{an}(v), \\ 0 & \text{if } i \in V\setminus \mathrm{An}(v). \end{cases} \tag{4} \end{equation}
The cumulative distribution function of $X_v$ is $\operatorname{\mathbb{P}}(X_v \le x) = \exp\!(\!-\!\sum_{i \in V} b_{vi} / x)$ for $x > 0$ . We assume that $(X_v, v\in V)$ are unit-Fréchet, yielding the constraint
\begin{equation} \sum_{i\in V} b_{vi} = \sum_{i\in \mathrm{An}(v)} b_{vi} = 1, \qquad v\in V, \tag{5} \end{equation}
since $b_{vi} = 0$ whenever $i \notin \mathrm{An}(v)$ . It is thus necessary and sufficient to have
\begin{equation} c_{vv} = 1 - \sum_{i\in \mathrm{an}(v)} c_{ii} \max_{p\in\pi(i,v)} c_p, \qquad v\in V, \tag{6} \end{equation}
with $c_{vv}=1$ if $\mathrm{an}(v)=\varnothing$ . By (6), the coefficients $c_{vv}$ for $v \in V$ are determined recursively by the edge weights $c_{e}$ for $e \in E$ . If $c_{iv} \ge 1$ for some $(i, v) \in E$ , then (2) implies that $X_v \ge X_i \vee c_{vv} Z_v$ , and the constraint that $X_i$ and $X_v$ are unit-Fréchet distributed implies that $c_{vv} = 0$ , a case we want to exclude, as explained above. This is why we impose $0 < c_{e} < 1$ for all $e \in E$ from the start, yielding the parameter space
The notion of criticality is important for max-linear structural equation models. We refer to [Reference Améndola, Klüppelberg, Lauritzen and Tran2], [Reference Gissibl and Klüppelberg13], [Reference Gissibl, Klüppelberg and Lauritzen14], and [Reference Klüppelberg and Lauritzen22] for examples where different conditional independence relations arise depending on which path is critical, or for illustrations in the context of graph learning. According to [Reference Gissibl and Klüppelberg13, Definition 3.1], a path $p \in \pi(i, v)$ is max-weighted under $\theta \in \Theta$ if it realizes the maximum $\max_{p' \in \pi(i, v)} c_{p'}$ , where $p'$ ranges over all paths in $\pi(i,v)$ . In [Reference Améndola, Klüppelberg, Lauritzen and Tran2] the term critical is preferred.
If there is a (directed) path between two nodes, there is a unique shortest (directed) path between them (Lemma 2.2-1). This is crucial for our parametric model. We define the critical parameter space $\Theta_* \subset (0,1)^{E}$ as the set of parameters $\theta =(c_e, e \in E)$ such that for every $v \in V$ and every $i \in \mathrm{an}(v)$ , the unique shortest directed path from i to v is the only critical path. In other words, $c_{p(i,v)} > c_{p}$ for every $p \in \pi(i,v)$ different from p(i, v). Formally,
\[ \Theta_* = \bigl\{ \theta \in (0,1)^{E} \;:\; c_{p(i,v)} > c_{p} \text{ for all } v\in V,\ i\in \mathrm{an}(v),\ p\in \pi(i,v)\setminus\{p(i,v)\} \bigr\}. \]
Next, we consider the intersection of the two spaces as an appropriate parameter space for our max-linear structural equation model:
For $\theta \in \mathring{\Theta}_*$ , every element of the max-linear coefficient matrix $B_{\theta}=(b_{vi})_{v,i\in V}$ can be rewritten using an edge weight product over the unique shortest path p(i, v) via
\[ b_{vi} = \begin{cases} c_{p(i,v)}\, c_{ii} & \text{if } i \in \mathrm{An}(v), \\ 0 & \text{otherwise.} \end{cases} \]
Also, note that $b_{ii}=c_{ii}$ , leading to the frequently used expression
\begin{equation} b_{vi} = \begin{cases} c_{p(i,v)}\, b_{ii} & \text{if } i \in \mathrm{An}(v), \\ 0 & \text{otherwise.} \end{cases} \tag{8} \end{equation}
Example 2.1. (Criticality.) The following example shows what happens if the assumption that all shortest paths are critical is omitted. Consider a max-linear model on three nodes $\{1,2,3\}$ and three edges $\{(1,2), (2,3),(1,3)\}$ . The corresponding edge weights are $c_{12}, c_{23}, c_{13}$ . We have
\[ X_1 = c_{11} Z_1, \qquad X_2 = c_{12} X_1 \vee c_{22} Z_2, \qquad X_3 = c_{13} X_1 \vee c_{23} X_2 \vee c_{33} Z_3. \]
The coefficient matrix $B=\{b_{iv}\}$ from (4) together with (5) and (6) is
If the shortest path $p=\{(1,3)\}$ from node 1 to node 3 is not critical, then we have $b_{31}=c_{12}c_{23}$ and also $b_{33}=1-c_{23}$ . In this way the coefficient $c_{13}$ has completely left the model. When considering the identifiability problem, we cannot hope to identify a coefficient from some marginal distribution if it is not even identifiable from the full one.
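The effect described in Example 2.1 can be reproduced numerically. The following Python sketch computes the coefficients $b_{vi}$ recursively from the edge weights, using that a max-weighted path from i to v must enter v through one of its parents. The function name `coefficient_matrix` and the numerical weights are illustrative assumptions and are not taken from the paper.

```python
def coefficient_matrix(order, c):
    """Return b[v][i] = b_vi from edge weights c[(i, v)].

    `order` must list the nodes in a topological order of the DAG.
    """
    b = {v: {i: 0.0 for i in order} for v in order}
    for v in order:
        pa = [k for k in order if (k, v) in c]
        for i in order:
            if i != v:
                # Max-weighted path from i to v: best parent k of v, times c_kv.
                b[v][i] = max((c[(k, v)] * b[k][i] for k in pa), default=0.0)
        # Unit-Frechet margins force the row of v to sum to one.
        b[v][v] = 1.0 - sum(b[v][i] for i in order if i != v)
    return b

order = [1, 2, 3]

# Hypothetical weights: shortest path (1, 3) critical, since 0.4 > 0.5 * 0.5.
critical = coefficient_matrix(order, {(1, 2): 0.5, (2, 3): 0.5, (1, 3): 0.4})

# Hypothetical weights: shortest path not critical, since 0.2 < 0.5 * 0.5.
non_critical = coefficient_matrix(order, {(1, 2): 0.5, (2, 3): 0.5, (1, 3): 0.2})
other = coefficient_matrix(order, {(1, 2): 0.5, (2, 3): 0.5, (1, 3): 0.1})

print(critical[3])
print(non_critical[3])
print(non_critical == other)
```

In the non-critical case the last row of the matrix no longer depends on $c_{13}$: two different values of $c_{13}$ produce the same coefficient matrix, and hence the same distribution.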
All of the elements are now in place for us to describe our main object of interest.
Assumption 2.1. (Max-linear structural equation model on a TTT.) The random vector $X = (X_v, v \in V)$ has the max-linear representation in (3) and (8) with respect to the TTT $\mathcal{T} = (V, E)$ (Definition 2.1) where $(Z_v, v \in V)$ is a vector of independent unit-Fréchet random variables and the edge weight vector $\theta = (c_e, e \in E)$ belongs to $\mathring{\Theta}_*$ in (7).
The following identity for nodes with a unique parent will be useful:
\begin{equation} c_{iv} = 1 - b_{vv} \qquad \text{if } \mathrm{pa}(v) = \{i\}. \tag{9} \end{equation}
Indeed, if i is the only parent of v, then $X_v = c_{iv} X_i \vee c_{vv} Z_v$ by (2). The variables $X_v, X_i, Z_v$ are unit-Fréchet distributed and $X_i$ is independent of $Z_v$ , since $X_i$ is a function of $(Z_u, u \in \mathrm{An}(i))$ and $v \not\in \mathrm{An}(i)$ . Hence $ c_{iv} + c_{vv} = 1 $ , and because $c_{vv} = b_{vv}$ , Equation (9) follows.
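A small simulation illustrates why $c_{iv} + c_{vv} = 1$ keeps the child variable unit-Fréchet. The sketch below is an assumption-laden illustration: the parent $X_i$ is drawn directly as a unit-Fréchet variable (as if i were a source), the weight 0.3, the sample size, and the seed are arbitrary, and the function names are hypothetical.

```python
import math
import random

def unit_frechet(rng):
    """Inverse-CDF draw from P(Z <= z) = exp(-1/z), z > 0."""
    u = rng.random()
    while u == 0.0:          # avoid log(0); rng.random() lies in [0, 1)
        u = rng.random()
    return -1.0 / math.log(u)

def simulate_child(c_iv, n, seed=0):
    """Draw n pairs (X_i, X_v) with X_v = c_iv X_i v (1 - c_iv) Z_v."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x_i = unit_frechet(rng)              # parent, unit-Frechet by assumption
        z_v = unit_frechet(rng)              # independent factor of the child
        pairs.append((x_i, max(c_iv * x_i, (1.0 - c_iv) * z_v)))
    return pairs

sample = simulate_child(c_iv=0.3, n=100_000)
for x in (0.5, 1.0, 3.0):
    empirical = sum(x_v <= x for _, x_v in sample) / len(sample)
    # Compare the empirical CDF of X_v with the unit-Frechet CDF exp(-1/x).
    print(x, round(empirical, 3), round(math.exp(-1.0 / x), 3))
```

Up to Monte Carlo error, the empirical distribution function of the simulated $X_v$ should agree with $\exp(-1/x)$ at each threshold.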
A notational convention: in the case of double subscripts, we may also write $x_{i_1,i_2}$ instead of $x_{i_1i_2}$ .
2.4. The angular measure
Let X follow a max-linear model with parameter vector $\theta$ as in Assumption 2.1. The joint distribution $P_\theta$ of X on $[0, \infty)^V$ is max-stable and has unit-Fréchet margins. It is determined by
\[ \operatorname{\mathbb{P}}(X_v \le x_v, v\in V) = \exp\{-l_{\theta}(1/x_v, v\in V)\}, \qquad x \in (0,\infty]^V, \]
where the stable tail dependence function (STDF) $l_{\theta} \;:\; [0, \infty)^V \to [0, \infty)$ is
\begin{equation} l_{\theta}(x) = \sum_{i\in V} \max_{v\in V} b_{vi} x_v \tag{10} \end{equation}
for $x = (x_v)_{v \in V} \in [0, \infty)^V$ (see [Reference Einmahl, Krajina and Segers10]).
Let $H_{\theta}$ be the angular measure on the unit simplex $\Delta_V = \{ a \in [0, 1]^V \;:\; \sum_{v \in V} a^{(v)} = 1\}$ corresponding to the STDF $l_{\theta}$ . The link between the STDF and the angular measure is detailed in [Reference De Haan and Ferreira9] for the bivariate case and in [Reference Resnick27, Chapter 5] and [Reference Beirlant, Goegebeur, Segers and Teugels7, Chapters 7–8] for higher dimensions: we have
\[ l_{\theta}(x) = \int_{\Delta_V} \max_{v\in V} \bigl( a^{(v)} x_v \bigr) \, H_{\theta}(\mathrm{d}a), \qquad x\in [0,\infty)^V. \]
In view of the expression for $l_{\theta}$ in (10), the angular measure is discrete and satisfies
\begin{equation} H_{\theta} = \sum_{i\in V} m_i \delta_{a_i}, \tag{11} \end{equation}
with masses $m_i = \sum_{v \in V} b_{vi}$ and atoms $a_i = (b_{vi} / m_i)_{v \in V} \in \Delta_V$ for $i \in V$ [Reference Einmahl, Krajina and Segers10, p. 1779]. The notation $\delta_{x}$ refers to a unit point mass at x.
If X follows a max-linear model, the angular measure of X is identifiable from its distribution $P_\theta$ via the limit relation
\begin{equation} t \operatorname{\mathbb{P}}\bigl( \|{X}\|_1 > t, \; X/\|{X}\|_1 \in \cdot \, \bigr) \xrightarrow{w} H_{\theta}(\cdot), \qquad t\to\infty, \tag{12} \end{equation}
where $\|{x}\|_1 = \sum_i |x_i|$ for a vector x in Euclidean space, while the arrow $\xrightarrow{w}$ denotes weak convergence of finite Borel measures, in this case on $\Delta_V$ .
When we discuss latent variables and identifiability in Section 4, we will have to deal with the angular measure of a subvector of X, say $X_U = (X_v)_{v \in U}$ , for non-empty $U \subset V$ . Its STDF $l_{\theta, U}$ is obtained from $l_\theta$ by setting $x_v = 0$ for all $v \not\in U$ : for $x \in [0, \infty)^U$ we have
\[ l_{\theta,U}(x) = \sum_{i\in V} \max_{v\in U} b_{vi} x_v. \]
The distribution of $X_U$ is max-linear too, so that its angular measure $H_{\theta,U}$ on $\Delta_U$ has a similar form to that of X:
with masses $m_{i,U} = \sum_{v \in U} b_{vi}$ and atoms $a_{i,U} = (b_{vi} / m_{i,U})_{v \in U} \in \Delta_U$ for $i \in V$ .
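The masses and atoms of the angular measure are simple functions of the coefficient matrix, both for X and for a subvector $X_U$. The sketch below is illustrative only: the function names are hypothetical, the matrix B is a hypothetical three-node example (weights $c_{12}=0.5$, $c_{23}=0.5$, $c_{13}=0.4$), and skipping factors with zero mass on U is an implementation choice made here, not a statement from the paper.

```python
def angular_measure(B, nodes, U=None):
    """Masses and atoms {i: (m_i, a_i)} for X_U; B[v][i] = b_vi, U defaults to all nodes."""
    U = list(nodes) if U is None else list(U)
    measure = {}
    for i in nodes:
        m = sum(B[v][i] for v in U)
        if m > 0:  # assumption: factors carrying no mass on U are skipped
            measure[i] = (m, {v: B[v][i] / m for v in U})
    return measure

def stdf(B, nodes, x):
    """Stable tail dependence function: sum over factors of max_v b_vi x_v."""
    return sum(max(B[v][i] * x[v] for v in x) for i in nodes)

def stdf_from_measure(measure, x):
    """The same function, written as an integral against the discrete angular measure."""
    return sum(m * max(a[v] * x[v] for v in x) for m, a in measure.values())

nodes = [1, 2, 3]
B = {1: {1: 1.0, 2: 0.0, 3: 0.0},      # hypothetical coefficient matrix
     2: {1: 0.5, 2: 0.5, 3: 0.0},
     3: {1: 0.4, 2: 0.25, 3: 0.35}}

H = angular_measure(B, nodes)
H_U = angular_measure(B, nodes, U=[2, 3])
x = {1: 1.0, 2: 2.0, 3: 0.5}

print(sum(m for m, _ in H.values()))        # total mass of H
print(sum(m for m, _ in H_U.values()))      # total mass of H_U
print(stdf(B, nodes, x), stdf_from_measure(H, x))
```

Because every row of B sums to one, the total mass equals the number of observed variables, and each atom lies on the unit simplex.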
3. Conditional tail limit and the TTT with unique source
Here we study the limit distribution of
\begin{equation} \left( X_v/X_u, v\in V \right) \mid X_u > t, \qquad t\to\infty, \tag{13} \end{equation}
when X is a max-linear model with respect to a TTT $\mathcal{T} = (V, E)$ as in Assumption 2.1. In particular, we are interested to know whether the elements of the limiting vector of (13) can be factorized into products of independent increments, similarly to other models with this property as in [Reference Asenova and Segers5, Reference Segers30]. In Proposition 3.1 below, we show that the limit variables factorize according to the unique shortest trails under the condition that the TTT has a unique source (node without parents). Moreover, by Proposition 3.2, the latter criterion is necessary and sufficient for X to satisfy the global Markov property with respect to the skeleton graph associated to $\mathcal{T}$ , i.e., the undirected counterpart of $\mathcal{T}$ .
Even though Proposition 3.1 below looks similar to Theorem 3.5 in [Reference Asenova and Segers5], it does not follow from it. The reason is that we have not been able to verify Assumptions 3.1 and 3.4 in that article for the recursive max-linear model. In these assumptions, the conditioning event involves equality, i.e., $\{X_u=t\}$ , and calculating the conditional distributions and their limits is not easy. This is why we have opted here for a different route: in (13), the conditioning event is $\{X_u>t\}$ and the limit conditional distribution as $t \to \infty$ is found from [Reference Segers31, Example 1].
According to Property (P3), any pair of distinct nodes in a TTT is connected by a unique shortest trail. Let t(u, v) denote the set of edges along the unique shortest trail between two distinct nodes u and v. Consider for instance the shortest trail between nodes 2 and 8 in Figure 2: $t(2,8)=\{(3,7), (7,8), (3,2)\}$ . In contrast, let $t_{u}(u,v)$ be the set of edges incident to the same node set but directed from u to v, irrespective of their original directions, e.g., $t_2(2,8)=\{(2,3), (3,7), (7, 8)\}$ .
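The edge sets $t(u,v)$ and $t_u(u,v)$ can be computed by a breadth-first search on the skeleton. Since Figure 2 is not reproduced here, the sketch below uses a hypothetical five-node TTT that merely contains the edges (3, 2), (3, 7), and (7, 8) from the example; the function names are likewise assumptions.

```python
from collections import deque

def shortest_trail(edges, u, v):
    """Node sequence of a shortest undirected path from u to v (BFS on the skeleton)."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    prev, queue = {u: None}, deque([u])
    while queue:
        a = queue.popleft()
        if a == v:
            break
        for b in adj[a]:
            if b not in prev:
                prev[b] = a
                queue.append(b)
    path = [v]
    while path[-1] != u:
        path.append(prev[path[-1]])
    return path[::-1]

def t(edges, u, v):
    """Edges of the shortest trail, with their original directions."""
    E = set(edges)
    seq = shortest_trail(edges, u, v)
    return {(a, b) if (a, b) in E else (b, a) for a, b in zip(seq, seq[1:])}

def t_u(edges, u, v):
    """Same trail, every edge re-directed away from u."""
    seq = shortest_trail(edges, u, v)
    return set(zip(seq, seq[1:]))

def edges_away_from(edges, nodes, u):
    """Union of t_u(u, v) over all nodes v different from u."""
    return set().union(*(t_u(edges, u, v) for v in nodes if v != u))

# Hypothetical TTT: tournament on {1, 2, 3}, then 3 -> 7 -> 8.
nodes = [1, 2, 3, 7, 8]
edges = [(1, 2), (1, 3), (3, 2), (3, 7), (7, 8)]

print(t(edges, 2, 8))
print(t_u(edges, 2, 8))
print(edges_away_from(edges, nodes, 2))
```

The search relies on the graph being connected; by Property (P3) the shortest trail is unique in a TTT, so the result does not depend on the order in which neighbours are visited.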
For a given node $u \in V$ , let $E_u$ be the set of all edges in such unique shortest paths directed away from u, that is,
\begin{equation} E_u = \bigcup_{v\in V\setminus u} t_u(u,v). \tag{14} \end{equation}
Recall from Section 2 that $\mathbb{T}$ denotes the set of tournaments within the TTT $\mathcal{T}$ . For fixed $u \in V$ there is for every tournament $\tau = (V_\tau, E_\tau) \in \mathbb{T}$ a node, say $w_{u,\tau}$ , which is the unique node in $V_\tau$ such that the trail $t(u, w_{u, \tau})$ is the shortest one among all trails between u and a node v in $V_\tau$ . As an example, consider Figure 3: starting from node $u = 8$ , the closest node from the node set $V_{\tau_1}=\{1,2,3\}$ is 3, so $w_{8,\tau_1}=3$ .
With these definitions we are ready to state the condition under which the limiting variables factorize into independent increments.
Proposition 3.1. (Factorization in max-linear model.) Let $(X_v, v\in V)$ follow a max-linear model as in Assumption 2.1. Fix $u\in V$ . Let $E_u$ be as in (14) and let $(M_e, e\in E_u)$ be a random vector composed of mutually independent subvectors $M^{(u,\tau)}=\bigl( M_{w_{u,\tau},j}\;:\; j \in V_\tau, (w_{u,\tau},j)\in E_u \bigr)$ , one for every transitive tournament $\tau\in \mathbb{T}$ , and with marginal distribution as in Lemma 3.1.
The following statements are equivalent:
-
(i) $\mathcal{T}$ has a unique source.
-
(ii) For every $u \in V$ , we have, as $t \to \infty$ , the weak convergence given by
\begin{equation} \mathcal{L}(X_v/X_u, v\in V\mid X_u>t)\stackrel{d}{\longrightarrow} \mathcal{L}(A^{(u)}) =\mathcal{L}(A_{uv}, v\in V) \tag{15} \end{equation}
with
\begin{equation} A_{uv}=\prod_{e\in t_u(u,v)}M_e, \qquad v\in V. \tag{16} \end{equation}
-
(iii) There exists $u\in V$ such that the limit in (15) and (16) holds.
The following lemma provides the distribution of $M^{(u,\tau)}$ in Proposition 3.1.
Lemma 3.1. Let $(X_v, v\in V)$ follow a max-linear model as in Assumption 2.1. Let $\tau \in \mathbb{T}$ be a transitive tournament on nodes $V_{\tau}$ . Then for $u\in V_{\tau}$ , we have
The vector $M^{(u,\tau)}=(M_{uv}, v\in V_{\tau})$ has dependent variables, and the distribution of a single element is as follows:
-
1. The distribution of $M_{uv}$ when $(u,v)\in E$ :
-
(a) If u is the source node of $\tau$ , the distribution is given by $\mathcal{L}(M_{uv})=\delta_{\{c_{uv}\}}$ .
-
(b) If u is not the source node of $\tau$ , the distribution is given by
\[ \mathcal{L}(M_{uv})=\sum_{j\in \mathrm{An}(u)}b_{uj} \delta_{\left\{\frac{c_{p(j,v)}}{c_{p(j,u)}}\right\}}. \]
-
2. The distribution of $M_{uv}$ when $(v,u)\in E$ :
-
(a) If v is the source node of $\tau$ , the distribution is given by
\[ \mathcal{L}(M_{uv})=c_{vu}\delta_{\{1/c_{vu}\}} +(1-c_{vu})\delta_{\{0\}}. \]
-
(b) If v is not the source node of $\tau$ , the distribution is given by
\[ \mathcal{L}(M_{uv}) =\sum_{j\in \mathrm{An}(v)} b_{uj}\delta_{\left\{\frac{c_{p(j,v)}}{c_{p(j,u)}}\right\}} +\sum_{j\in \mathrm{An}(u)\setminus \mathrm{An}(v)} b_{uj}\delta_{\{0\}}. \]
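The four cases of Lemma 3.1 can be checked numerically on a single tournament. The sketch below assumes the general form of the tail limit for a max-linear vector, namely that $X_v/X_u$ given $X_u>t$ converges to the discrete law putting mass $b_{uj}$ on $b_{vj}/b_{uj}$ for every j with $b_{uj}>0$; this is our reading of the example cited from [Reference Segers31] and should be treated as an assumption. The function name and the matrix B (weights $c_{12}=0.5$, $c_{23}=0.5$, $c_{13}=0.4$ on the tournament with edges (1,2), (1,3), (2,3)) are hypothetical.

```python
from collections import defaultdict

def ratio_limit(B, u, v):
    """Assumed limit law of X_v / X_u given X_u > t, as {atom: probability}."""
    law = defaultdict(float)
    for j, b_uj in B[u].items():
        if b_uj > 0:
            law[B[v][j] / b_uj] += b_uj   # factor j is the large one w.p. b_uj
    return dict(law)

# Hypothetical coefficient matrix, B[v][i] = b_vi.
B = {1: {1: 1.0, 2: 0.0, 3: 0.0},
     2: {1: 0.5, 2: 0.5, 3: 0.0},
     3: {1: 0.4, 2: 0.25, 3: 0.35}}

print(ratio_limit(B, 1, 2))  # case 1(a): u = 1 is the source, edge (1, 2)
print(ratio_limit(B, 2, 3))  # case 1(b): u = 2 is not the source, edge (2, 3)
print(ratio_limit(B, 2, 1))  # case 2(a): v = 1 is the source, edge (1, 2)
print(ratio_limit(B, 3, 2))  # case 2(b): v = 2 is not the source, edge (2, 3)
```

For instance, in case 2(a) the law has mass $c_{12}$ at $1/c_{12}$ and mass $1-c_{12}$ at zero, in agreement with the lemma.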
According to Proposition 3.1, the factorization property (16) holds either for all nodes or for no nodes at all, a necessary and sufficient condition being that the TTT has a unique source. The principle of (16) is illustrated in Figure 3 for $u = 8$ . The limit $A^{(u)}=(A_{uv}, v\in V\setminus u)$ is given by
where are independent subvectors by construction.
What underlies the link between the factorization of the limiting variables from Proposition 3.1 on the one hand and the uniqueness of the source of the TTT on the other hand is the Markovianity of X with respect to the skeleton graph T. The Markov property states that for any three non-empty and disjoint sets $A,B,C\subset V$ such that, in the graph T, the nodes in A are separated from the nodes in B by the nodes in C, the vector $X_A=(X_v, v\in A)$ is conditionally independent of $X_B$ given $X_C$ [Reference Lauritzen25]. A further equivalent condition can thus be added to the list in Proposition 3.1.
Proposition 3.2. Let X follow a max-linear model with respect to the TTT $\mathcal{T}$ as in Assumption 2.1. Then X satisfies the global Markov property with respect to the skeleton graph T if and only if $\mathcal{T}$ has a unique source.
Even though in Proposition 3.2 we consider the undirected graph T associated to the TTT $\mathcal{T}$ , the recursive max-linear specification of X is still with respect to the directed graph $\mathcal{T}$ itself. Indeed, the directions of the edges of $\mathcal{T}$ are intrinsically determined by the recursive max-linear model specification of $X = (X_v)_{v \in V}$ . We also consider the associated skeleton graph T, i.e., the graph without directions, because, in view of the factorization property in Proposition 3.1, we are interested in whether the global Markov property holds, and this property is most easily described in terms of T.
The proof of Proposition 3.2 is based on notions and results from [Reference Améndola, Klüppelberg, Lauritzen and Tran2], which provides an extensive study of conditional independence properties of max-linear models. In particular, [Reference Améndola, Klüppelberg, Lauritzen and Tran2] introduces the notion of a $*$ -connecting path between two nodes in a DAG, which is similar to the notion of an active path between two nodes [Reference Koller and Friedman23, Definition 3.6].
4. Latent variables and parameter identifiability
In practice, it is possible that on some of the nodes, the variables of interest are not observed (latent). Examples from the literature are water heights at certain locations in the river networks of the Danube [Reference Asadi, Davison and Engelke3] and the Seine [Reference Asenova, Mazo and Segers4]. We look at the problem of recovering all parameters of the distribution of the complete vector, based on the distribution of the observed variables only. If this is possible, we can study the parametric model as if all variables were observed: in particular, we are able to compute measures of tail dependence for sets including the unobserved variables. The latter is important as it may be the only possible way to quantify tail dependence, because non-parametric estimates are not available when dealing with unobserved variables.
Consider for instance the network in Figure 4. The max-linear model on $\mathcal{T}=(V,E)$ has eight variables and eleven parameters $\theta=(c_e, e\in E)$ . By Proposition 4.1 below, the parameter $\theta \in \mathring{\Theta}_*$ can be uniquely identified in the case when $X_1, X_3, X_7$ are not observed, on the basis of the joint distribution of the remaining five variables, $X_U=(X_{2}, X_4,X_5,X_6,X_8)$ .
The problem of parameter identifiability will be formalized on the level of the angular measure $H_{\theta}$ and is presented in detail in the next two subsections.
4.1. Graph-induced characteristics of the angular measure
In this subsection, we argue that the condition $\theta=(c_e, e\in E)\in \mathring{\Theta}_*$ guarantees that all edge weights in $\theta$ are uniquely identifiable from the angular measure $H_{\theta}$ of $X = (X_v, v\in V)$ and thus from the distribution $P_\theta$ of X. Recall from (11) that $H_{\theta}$ is discrete with atoms $a_i = (a_{vi})_{v \in V} \in \Delta_V$ and masses $m_i > 0$ .
Thanks to the assumption $\theta\in \mathring{\Theta}_*$ , we have
\[ a_{vi} > 0 \iff b_{vi} > 0 \iff v \in \mathrm{Desc}(i), \qquad i, v \in V. \tag{18} \]
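For any DAG, all nodes have different sets of descendants, i.e.,
\[ \mathrm{Desc}(i) \neq \mathrm{Desc}(j) \qquad \text{for all } i, j \in V \text{ with } i \neq j. \tag{19} \]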
Indeed, if $i \ne j$ and $\mathrm{Desc}(i) \subseteq \mathrm{Desc}(j)$ , then $i \in \mathrm{desc}(j)$ and hence $j \not\in \mathrm{desc}(i)$ , so that $\mathrm{Desc}(j) \not\subseteq \mathrm{Desc}(i)$ .
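The claim that distinct nodes of a DAG have distinct sets $\mathrm{Desc}(\cdot)$ can be checked numerically on a small example. The following Python sketch is only an illustration: the graph (two triangles sharing node 3) and the helper name `closed_descendants` are hypothetical and do not come from the paper.

```python
def closed_descendants(nodes, edges):
    """Map each node i to Desc(i): the node itself together with all its descendants."""
    children = {v: set() for v in nodes}
    for parent, child in edges:
        children[parent].add(child)
    result = {}
    for i in nodes:
        seen, stack = {i}, [i]
        while stack:
            for w in children[stack.pop()]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        result[i] = frozenset(seen)
    return result


# Hypothetical DAG, not taken from the paper: two triangles sharing node 3.
nodes = [1, 2, 3, 4, 5]
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5)]

desc = closed_descendants(nodes, edges)
# Distinct nodes should have distinct closed descendant sets.
all_distinct = len(set(desc.values())) == len(nodes)
print(all_distinct)
```

Because the sets are all different, the zero pattern of an atom suffices to tell which node it belongs to, which is the mechanism used in Lemma 4.1 below.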
Lemma 4.1. Let $(X_v, v \in V)$ follow a max-linear model as in Assumption 2.1, with parameter vector $\theta \in \mathring{\Theta}_*$ and induced coefficient matrix $(b_{vi})_{i,v \in V}$ . Let $H_\theta = \sum_{i \in V} m_i \delta_{a_i}$ in (11) be its angular measure. Then the following hold:
-
(1) We have $m_i > 0$ for all $i \in V$ .
-
(2) For any atom $a_i = (a_{vi})_{v \in V}$ , we have $a_{vi} > 0$ if and only if $v \in \mathrm{Desc}(i)$ . In particular, all $|V|$ vectors $a_i$ are different, and every atom can be matched uniquely to a node in V.
-
(3) For each edge $(i, v) \in E$ , we have $c_{iv} = b_{vi} / b_{ii} = a_{vi} / a_{ii}$ .
In particular, $\theta \in \mathring{\Theta}_*$ is identifiable from $H_\theta$ and thus from $P_\theta$ ; i.e., for $\theta_1 \neq \theta_2\in \mathring{\Theta}_*$ we have $H_{\theta_1} \neq H_{\theta_2}$ and thus $P_{\theta_1} \neq P_{\theta_2}$ .
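To make part (3) of Lemma 4.1 concrete, the sketch below builds the atoms and masses of an angular measure from edge weights and then recovers the edge weights as $a_{vi}/a_{ii}$. It is a minimal illustration under stated assumptions: the numbers are those of Example 4.1, the coefficients are computed as $b_{vi} = c_{p(i,v)} b_{ii}$ with the path product taken as the largest one over directed paths, the masses are the column sums $m_i = \sum_v b_{vi}$, and the function name `max_path_products` is ours.

```python
def max_path_products(nodes, c):
    """best[i][v]: largest product of edge weights over directed paths from i to v.

    Equals 1 when i == v and 0 when v is not a descendant of i.
    `nodes` must be listed in topological order (parents before children).
    """
    best = {i: {v: (1.0 if i == v else 0.0) for v in nodes} for i in nodes}
    for i in nodes:
        for v in nodes:
            for (p, q), weight in c.items():
                if q == v:
                    best[i][v] = max(best[i][v], best[i][p] * weight)
    return best


# Edge weights and diagonal coefficients of Example 4.1.
nodes = [2, 1, 3]                        # topological order for E = {(2, 1), (1, 3)}
c = {(2, 1): 0.8, (1, 3): 0.5}
b_diag = {1: 0.2, 2: 1.0, 3: 0.5}        # c_11, c_22, c_33

best = max_path_products(nodes, c)
# b_col[i][v] is the coefficient b_{vi} = c_{p(i,v)} * b_{ii} (column i of the matrix).
b_col = {i: {v: best[i][v] * b_diag[i] for v in nodes} for i in nodes}
mass = {i: sum(b_col[i].values()) for i in nodes}                      # m_i
atom = {i: {v: b_col[i][v] / mass[i] for v in nodes} for i in nodes}   # a_i

# Lemma 4.1(3): recover each edge weight from the atom matched to its tail node.
recovered = {(i, v): atom[i][v] / atom[i][i] for (i, v) in c}
print(recovered)
```

The normalization by $m_i$ cancels in the ratio, which is why the edge weights can be read off the atoms directly once each atom has been matched to its node.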
The criticality assumption in Lemma 4.1 cannot be dropped. If the edge (i, v) is not critical, then there is another path, say $p'$, from i to v with path product $c_{p'} \ge c_{iv}$. We can then lower the value of $c_{iv}$ further without changing the coefficients in (4), because they involve $c_{p'}$ rather than $c_{iv}$, so that the measure $H_\theta$ stays the same. This shows that without the criticality assumption, some edge weights may not be identifiable from $H_\theta$ .
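The loss of identifiability for a non-critical edge can be seen on a triangle with edges 1→2, 2→3 and 1→3. The sketch below assumes, in line with the remark above, that the coefficient linking node 1 to node 3 is driven by the largest path product; the weights and the function name are hypothetical.

```python
def coefficient_1_to_3(c12, c23, c13):
    """Largest path product from node 1 to node 3 in the triangle 1->2, 2->3, 1->3."""
    return max(c13, c12 * c23)


# Hypothetical weights: the path through node 2 has product 0.5 * 0.6 = 0.3.
non_critical_a = coefficient_1_to_3(0.5, 0.6, 0.25)   # c13 below the path product
non_critical_b = coefficient_1_to_3(0.5, 0.6, 0.10)   # c13 lowered further
critical = coefficient_1_to_3(0.5, 0.6, 0.40)         # c13 above the path product

print(non_critical_a == non_critical_b, critical)
```

In the two non-critical cases the coefficient is the same although $c_{13}$ differs, so $c_{13}$ cannot be recovered; in the critical case the coefficient equals $c_{13}$ itself.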
Example 4.1 (Unique zero patterns.)
In dimension $d = 3$ , consider an angular measure given by the following atoms and masses:
Consider the vectors $\beta_j = \mu_j \omega_j$ for $j \in \{1,2,3\}$ . By Lemma 4.1, the unordered collection $\{ \beta_1, \beta_2, \beta_3 \} = \{(0.8, 1, 0.4)^\top, (0, 0, 0.5)^\top, (0.2, 0, 0.1)^\top\}$ permits one to recover the values of the coefficients in the max-linear model
with (known) edge set $E = \{(2, 1), (1, 3)\}$ , and this is due to the presence of zeros in the vectors. For the current example, we argue as follows. The angular measure $H_{\theta}$ of $(X_1,X_2, X_3)$ has three atoms: atom $a_Z=b_Z/m_Z$ with $b_{Z}=(c_{11}, 0, c_{13}c_{11})^\top$ , atom $a_Y=b_Y/m_Y$ with $b_{Y}=(c_{21}c_{22},\, c_{22},\, c_{13}c_{21}c_{22})^\top$ , and atom $a_T=b_T/m_T$ with $b_{T}=(0,0,c_{33})^\top$ . As unordered sets, $\{\beta_1, \beta_2, \beta_3\}$ and $\{b_Z, b_Y, b_T\}$ are equal, but the question is which vector $\beta_j$ corresponds to which vector $b_{*}$ . From an inspection of the zero entries of the vectors, it is easily seen that the only possible way to identify the three coefficient vectors $\beta_1, \beta_2, \beta_3$ with the vectors $b_Z, b_Y, b_T$ of the angular measure $H_\theta$ is
Solving the equations yields $(c_{11}, c_{21}, c_{22}, c_{13}, c_{33})=(0.2, 0.8, 1, 0.5, 0.5)$ .
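The matching by zero patterns in Example 4.1 can be reproduced mechanically. The following sketch takes the three vectors $\beta_j$ as an unordered list, matches each to $b_Z$, $b_Y$ or $b_T$ by comparing supports, and then solves for the coefficients. The supports are read off the expressions for $b_Z, b_Y, b_T$ given in the example; the variable names are ours.

```python
# The unordered collection of vectors beta_j from Example 4.1.
betas = [(0.8, 1.0, 0.4), (0.0, 0.0, 0.5), (0.2, 0.0, 0.1)]

# Supports (0-based positions of the non-zero entries) implied by
# b_Z = (c11, 0, c13*c11), b_Y = (c21*c22, c22, c13*c21*c22), b_T = (0, 0, c33).
supports = {"Z": {0, 2}, "Y": {0, 1, 2}, "T": {2}}


def support(vec):
    return {k for k, x in enumerate(vec) if x > 0}


match = {}
for name, s in supports.items():
    candidates = [beta for beta in betas if support(beta) == s]
    assert len(candidates) == 1          # the zero pattern must single out one vector
    match[name] = candidates[0]

# Solve the resulting equations for the coefficients.
c11 = match["Z"][0]
c13 = match["Z"][2] / c11
c22 = match["Y"][1]
c21 = match["Y"][0] / c22
c33 = match["T"][2]
print((c11, c21, c22, c13, c33))
```

The matching succeeds only because the three supports are pairwise different; Section 4.2 describes what can go wrong when they are not.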
4.2. Identifiability issues with the angular measure of a subvector
When we deal with latent variables, we know the distribution of the observable variables only, $X_U=(X_v, v\in U)$ for non-empty $U \subset V$ . The angular measure, say $H_{\theta,U}$ , of $X_U$ in (12) is discrete and takes the form
with masses $\mu_r > 0$ and s distinct atoms $\omega_r \in \Delta_U$ . Combining (12) and (20), we should have
which means that, as sets, we should have $\{ \omega_1, \ldots, \omega_s \}=\{ a_{i,U} \;:\; i \in V \}$ . In contrast to the situation in Lemma 4.1, the subvectors $a_{i,U}$ for $i \in V$ are not necessarily all different. Any atom $\omega_r$ of $H_{\theta, U}$ is of the form $a_{i,U} = (b_{vi}/m_{i,U})_{v \in U}$ for one or possibly several indices $i \in V$ . For $r=1,\ldots, s$ and $i \in V$ such that $\omega_r = a_{i, U}$ , we know from (18) that
The (unordered) collection of vectors $\{(b_{vi})_{v\in U} \;:\; i \in V\}$ will be denoted by $\mathcal{B}_{\theta,U}$ .
With unobservable variables, there are several issues with the angular measure and the expression for it on the right-hand side of (21):
-
Zero masses. We have $m_{i,U}=\sum_{v\in U}b_{vi}$ , so that if all components of $(b_{vi})_{v\in U}$ are zero, then $m_{i,U}=0$ . This happens when $\mathrm{Desc}(i)\cap U=\varnothing$ . In this case, we have $s<|V|$ , i.e., $H_{\theta,U}$ has fewer atoms than $H_\theta$ .
-
Equal atoms. We may have $a_{i,U}=a_{j,U}$ for some indices $i,j\in V$ and $i\neq j$ . In this case, the terms i and j in (12) are to be aggregated, and again, $H_{\theta,U}$ has fewer than $|V|$ atoms, $s<|V|$ . This happens when the vectors $(b_{vi}, v\in U)$ and $(b_{vj}, v\in U)$ are proportional for some distinct $i,j\in V$ .
-
Zeros in the same positions. A more subtle problem occurs when for two distinct vectors $b,b'\in \mathcal{B}_{\theta,U}$ , the supports $\{v\in U\;:\;b_v>0\}$ and $\{v\in U\;:\;b'_v>0\}$ are equal. Such a situation arises when two distinct nodes $i,j\in V$ satisfy $\mathrm{Desc}(i)\cap U=\mathrm{Desc}(j)\cap U$ . The latter equality is possible only in the presence of latent variables and is to be contrasted with Property (19) when all variables are observable.
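The three issues listed above can be detected numerically once the coefficient vectors are restricted to the observed nodes. The sketch below is an illustration only: the tree with edges 1→2, 2→3, 2→4, its weights, and the helper names are hypothetical, and the coefficients are computed as $b_{vi} = c_{p(i,v)} b_{ii}$ with diagonal values chosen so that each row of the coefficient matrix sums to one.

```python
def coefficient_columns(nodes, c, b_diag):
    """Column i maps v to b_{vi} = (largest path product from i to v) * b_{ii}.

    `nodes` must be in topological order.
    """
    best = {i: {v: (1.0 if i == v else 0.0) for v in nodes} for i in nodes}
    for i in nodes:
        for v in nodes:
            for (p, q), weight in c.items():
                if q == v:
                    best[i][v] = max(best[i][v], best[i][p] * weight)
    return {i: {v: best[i][v] * b_diag[i] for v in nodes} for i in nodes}


def diagnose(columns, observed):
    """Return (nodes with zero mass, groups with equal atoms, groups with equal supports)."""
    zero_mass, by_atom, by_support = [], {}, {}
    for i, col in columns.items():
        sub = [col[v] for v in observed]
        m = sum(sub)                                   # m_{i,U}
        if m == 0:
            zero_mass.append(i)
            continue
        by_atom.setdefault(tuple(round(x / m, 12) for x in sub), []).append(i)
        by_support.setdefault(tuple(x > 0 for x in sub), []).append(i)
    equal_atoms = [g for g in by_atom.values() if len(g) > 1]
    same_support = [g for g in by_support.values() if len(g) > 1]
    return zero_mass, equal_atoms, same_support


# Hypothetical tree 1 -> 2, 2 -> 3, 2 -> 4 with rows of the coefficient matrix summing to one.
nodes = [1, 2, 3, 4]
c = {(1, 2): 0.5, (2, 3): 0.6, (2, 4): 0.4}
b_diag = {1: 1.0, 2: 0.5, 3: 0.4, 4: 0.6}
columns = coefficient_columns(nodes, c, b_diag)

leaves_observed = diagnose(columns, observed=[3, 4])   # nodes 1 and 2 latent
roots_observed = diagnose(columns, observed=[1, 2])    # nodes 3 and 4 latent
print(leaves_observed, roots_observed)
```

With only the leaves observed, nodes 1 and 2 produce the same atom and the same support; with only nodes 1 and 2 observed, the latent leaves contribute zero mass.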
4.3. Identifiability criterion
For a max-linear model with respect to a TTT $\mathcal{T}=(V,E)$ with unique source, we need conditions that ensure that the minimal representation of the angular measure of $X_U$ is the one in (12). Consider the following two conditions for the set of nodes $\bar{U} = V \setminus U$ carrying latent variables:
-
(I1) Any $u \in \bar{U}$ has at least two children.
-
(I2) Any $u \in \bar{U}$ is the source of some tournament in $\mathcal{T}$ .
Proposition 4.1. Let X follow a max-linear model as in Assumption 2.1 with respect to a TTT $\mathcal{T}=(V,E)$ with unique source. For a non-empty node set $U \subset V$ , the parameter $\theta\in \mathring{\Theta}_*$ is uniquely identifiable from the distribution of $(X_v, v \in U)$ if and only if the conditions (I1) and (I2) are satisfied.
Figure 4 illustrates the identifiability criterion.
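Conditions (I1) and (I2) are purely graphical and can be checked directly. The sketch below is one possible way to do so, assuming that the tournaments of the TTT are supplied by hand as node sets; the example graph and the function name are hypothetical and are not the network of Figure 4.

```python
def check_identifiability_conditions(edges, tournaments, latent):
    """Return (I1 holds, I2 holds) for the set of latent nodes."""
    edge_set = set(edges)
    children = {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)

    def is_source_of_some_tournament(u):
        # u is the source of tau if no other node of tau points to u.
        return any(
            u in tau and not any((p, u) in edge_set for p in tau)
            for tau in tournaments
        )

    i1 = all(len(children.get(u, set())) >= 2 for u in latent)
    i2 = all(is_source_of_some_tournament(u) for u in latent)
    return i1, i2


# Hypothetical TTT with unique source 1: a triangle {1, 2, 3} and two edges leaving node 3.
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5)]
tournaments = [{1, 2, 3}, {3, 4}, {3, 5}]

latent_13 = check_identifiability_conditions(edges, tournaments, latent={1, 3})
latent_2 = check_identifiability_conditions(edges, tournaments, latent={2})
latent_4 = check_identifiability_conditions(edges, tournaments, latent={4})
print(latent_13, latent_2, latent_4)
```

In this example, hiding nodes 1 and 3 satisfies both conditions, whereas hiding node 2 (one child, not a source) or the leaf 4 violates them.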
5. Discussion
In this paper we have considered a Bayesian max-linear network over a special type of graph which we call a tree of transitive tournaments (TTT). This is a graph which collects in an acyclic manner transitive tournaments which are themselves complete DAGs. The max-linear model is defined on a particular parameter space which ensures that the impact from one variable to another takes place along the shortest path, a consideration that has been defined in the literature as the path’s criticality. It turns out that a TTT with unique source leads to a graph without v-structures; that is, no node has non-adjacent parents. The limit of the scaled random vector, conditional on the event that a high threshold is exceeded at a particular node, is shown to be factorizable into independent multiplicative increments if and only if the TTT has a unique source. This result is analogous to those for Markov trees in [Reference Segers31] and for Markov random fields on undirected block graphs in [Reference Asenova and Segers5]. The feature that the Bayesian max-linear model on a TTT with unique source shares with these other two models is that it satisfies the global Markov property with respect to the undirected counterpart or skeleton graph of the TTT.
In addition, we have provided a simple necessary and sufficient criterion guaranteeing the identifiability of the edge coefficients in the case when some variables are latent. As suggested by a reviewer, it may be possible to extend the criterion to partial identifiability of some edge weights in the case when the criterion is fulfilled only locally.
With appropriate modifications, we expect the results presented in this paper to hold equally for the linear additive causal model introduced in [Reference Gnecco, Meinshausen, Peters and Engelke16]. One of the reasons is that the max-domain of attraction of a linear model with heavy-tailed factors is the same as that of a max-linear one [Reference Einmahl, Krajina and Segers10]. However, the relation between the edge weights $\theta = (c_e)_{e \in E}$ and the coefficient matrix $B_\theta = (b_{ij})_{i,j \in V}$ is different between the max-linear and additive linear versions, and this may require different approaches to showing the same properties for the additive version.
Appendix A. Trees of transitive tournaments
Recall that in a DAG, a v-structure refers to a node with parents that are not adjacent; see Figure 1.
A.1 Proof of Lemma 2.2
Proof. Part 1. Let $a,b\in V$ . If a and b share the same tournament, they must be connected by an arrow, which is then the unique shortest path between them, since all other possible paths have length larger than one.
Let a, b be non-adjacent. If there is a unique directed path between a and b, then this is the unique shortest path. Suppose now that there are two shortest paths: $p_1, p_2\in \pi(a,b)$ . Let the path $p_1$ be along the vertices $\{v_1=a, v_2, \ldots, v_n=b\}$ and the path $p_2$ along the vertices $\{u_1=a, u_2, \ldots, u_n=b\}$ .
We will proceed by contradiction. Assume $v_2\neq u_2$ . If $v_2$ and $u_2$ belong to two different tournaments, then there exists a non-directed cycle through nodes in different tournaments, namely $\{a,v_2, \ldots, b,\ldots, u_2,a\}$ . But this is impossible by Property (P2) of a TTT. Hence, $v_2$ and $u_2$ must belong to the same tournament, say $\tau_a$ , because a is part of the same tournament too. Now consider $u_3$ and $v_3$ . Then either $u_3=v_3$ or they share a tournament, say $\tau_3$ , because otherwise there exists a non-directed cycle through nodes in different tournaments. Since $(v_2, v_3)\in E$ and $(u_2, u_3)\in E$ , and by the assumption that $v_2\neq u_2$ , all four nodes $\{a, v_2, u_2, v_3=u_3\}$ or all five nodes $\{a, v_2, u_2, v_3, u_3\}$ belong to $\tau_a$ . This is because by Property (P1), two tournaments can share only one node, so it is impossible to have $\tau_3\cap \tau_a=\{v_2, u_2\}$ . Because all four or five nodes belong to the same tournament, and since $(a,v_2), (v_2,v_3), (a,u_2), (u_2, u_3)\in E$ , we must have $(a,v_3)\in E$ and $(a,u_3)\in E$ to avoid inter-tournament undirected cycles. Hence the paths $\{a=v_1, v_3, \ldots, v_n=b\}$ and $\{u_1=a, u_3, \ldots, u_n=b\}$ are shorter than $p_1$ and $p_2$ , a contradiction. We must therefore have $v_2 = u_2$ .
We apply the same strategy to the nodes $v_3, u_3$ and $v_4, u_4$ to find that $v_3 = u_3$ . Proceeding recursively, we conclude that $p_1 = p_2$ .
Part 2. First we show that if the TTT has a unique source, there cannot be a v-structure. We proceed by contraposition. Assume that there is a node, v, with parents in two different tournaments $\tau_a$ and $\tau_b$ . Let a and b be the sources of $\tau_a$ and $\tau_b$ respectively [Reference Harary and Moser18, Corollary 5a]. Note that we definitely have $v\neq a$ and $v\neq b$ . From node v we go to node a. If a does not have a parent from another tournament, we have found one node with zero in-degree with respect to the whole graph. If a has parent(s) from another tournament, say $\tau'_{\!\!a}$ , then we go to the node that within $\tau'_{\!\!a}$ has in-degree zero, say node $a'$ . We keep on going until we find a node with in-degree zero within the whole graph; such a node must exist because the graph is finite. We repeat the same process for $\tau_b$ , yielding a second node with zero in-degree with respect to the whole graph. The two nodes must be different because of the definition of $\mathcal{T}$ : since we started in two different tournaments $\tau_a$ and $\tau_b$ , we cannot end up in the same node; otherwise there would be a non-directed cycle passing through v and that node. Hence we have found two nodes with zero in-degree, which means that $\mathcal{T}$ does not have a unique source node.
Next we show that if $\mathcal{T}$ has two or more source nodes, u and v, then there is a v-structure. Because u and v are sources they have in-degree zero, so that they cannot belong to the same tournament, and thus they belong to two different tournaments. Consider the unique shortest trail between u, v on a sequence of nodes $\{u=v_1, v_2, \ldots, v_n=v\}$ . Such a trail exists because, by the definition of a TTT, the skeleton of $\mathcal{T}$ is a block graph, and in a block graph there is a unique shortest path between every two nodes [Reference Behtoei, Jannesari and Taeri6, Theorem 1]. For every two consecutive nodes in the shortest path, $v_i,v_{i+1}$ , we have either $(v_i,v_{i+1})\in E$ or $(v_{i+1}, v_i)\in E$ . Because u and v are sources of $\mathcal{T}$ , we have $(u,v_2)\in E$ and $(v,v_{n-1})\in E$ . Note that $n \ge 3$ , since u and v cannot be adjacent. We need three nodes $v_i, v_{i+1}, v_{i+2}$ such that $(v_i,v_{i+1})\in E$ and $(v_{i+2}, v_{i+1})\in E$ . If $n = 3$ , then the triple $(u,v_2,v)$ already fulfils the requirement. If $n\geq 4$ , then we continue from $v_2$ as follows. Let $i = \max \{ j = 1,\ldots,n-2 \;:\;(v_{j}, v_{j+1}) \in E\}$ ; then $(v_i,v_{i+1}) \in E$ and $(v_{i+2},v_{i+1}) \in E$ , as required. Because this is the shortest trail, $v_i$ and $v_{i+2}$ cannot belong to the same tournament, since otherwise there would exist a shorter trail passing only through $v_i$ and $v_{i+2}$ .
Part 3. Suppose that $v \in \mathrm{Desc}(i) \cap \mathrm{Desc}(j)$ but also both $i \not\in \mathrm{an}(j)$ and $j \not\in \mathrm{an}(i)$ ; in particular, i and j do not belong to the same tournament. Consider the paths p(i, v) and p(j, v). Along each path, continue walking upwards considering successive parents. Since the graph is finite, this walk must end for both paths at a node without parents. By assumption, the latter must be the unique source node of the TTT, say $u_0$ . We will thus have found two different paths from $u_0$ to v, one passing via i and the other one via j. However, as i and j do not belong to the same tournament, this is in contradiction to Property (P2) of a TTT.
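The equivalence in Part 2 between having a unique source and having no v-structure can be illustrated on two small graphs. The sketch below is a sanity check on hypothetical examples, not a substitute for the proof; the function names are ours.

```python
from itertools import combinations


def sources(nodes, edges):
    """Nodes with in-degree zero in the whole graph."""
    return [v for v in nodes if all(child != v for _, child in edges)]


def v_structures(nodes, edges):
    """Triples (a, v, b) where a and b are non-adjacent parents of v."""
    adjacent = {frozenset(e) for e in edges}
    found = []
    for v in nodes:
        parents = [p for p, child in edges if child == v]
        for a, b in combinations(parents, 2):
            if frozenset((a, b)) not in adjacent:
                found.append((a, v, b))
    return found


# Hypothetical TTT with unique source: two triangles sharing node 3.
nodes_one = [1, 2, 3, 4, 5]
edges_one = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5)]

# Hypothetical TTT with two sources: a tree in which node 3 has parents 1 and 2.
nodes_two = [1, 2, 3, 4]
edges_two = [(1, 3), (2, 3), (3, 4)]

print(sources(nodes_one, edges_one), v_structures(nodes_one, edges_one))
print(sources(nodes_two, edges_two), v_structures(nodes_two, edges_two))
```

The first graph has a single source and no v-structure; the second has two sources and the v-structure formed by nodes 1, 3 and 2.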
A.2. Proof of Lemma 2.3
Proof. Part 1. Suppose that there is a node $v_r$ , for $r \in \{2, \ldots, n-1\}$ , which is not the source node in the tournament shared with $v_{r+1}$ , say $\tau$ . Let $\bar{v}$ be a parent of $v_r$ in $\tau$ . Note that $\bar{v}$ must be a parent of $v_{r+1}$ too, because of the out-degree ordering in a tournament. Because $v_{r-1}$ is a parent of $v_r$ too, both $v_{r-1}$ and $\bar{v}$ must belong to $\tau$ , since otherwise $v_r$ would have parents from different tournaments, which is impossible according to Lemma 2.2-2. Hence $v_{r-1}, v_r, \bar{v}, v_{r+1}$ all belong to the same tournament, i.e., to $\tau$ . Necessarily $v_{r-1}$ is a parent of $v_{r+1}$ , because otherwise there would be a directed cycle $\{v_{r-1},v_r, v_{r+1}, v_{r-1}\}$ . But then $\{v_1, \ldots, v_{r-1}, v_{r+1}, \ldots, v_n\}$ is a shorter path between $v_1$ and $v_n$ , in contradiction to the hypothesis.
Part 2. Let the shortest trail between u and v be the one along the node sequence $\{v_1=u, \ldots, v_n=v\}$ . It is sufficient to show that there cannot exist a node $v_{r}$ for $r \in \{2, \ldots, n-1\}$ such that $(v_{r-1}, v_r)\in E$ and $(v_{r+1},v_r)\in E$ . Indeed, suppose to the contrary that there exists $r \in \{2, \ldots, n-1\}$ such that both $v_{r-1}$ and $v_{r+1}$ are parents of $v_r$ . Then $v_{r-1}$ and $v_{r+1}$ must be adjacent, because v-structures are excluded by Lemma 2.2-2. But then $\{v_1, \ldots, v_{r-1}, v_{r+1}, \ldots, v_n\}$ is a shorter trail between u and v, yielding a contradiction.
Appendix B. Proofs and additional results for Section 3
Proof of Lemma 3.1. From [Reference Segers31, Example 1] we have the limit
Adapting this representation to a model where we have $b_{uj}=0$ for $j\notin \mathrm{An}(u)$ and $b_{ij}=c_{p(j,i)}b_{jj}$ for $j\in \mathrm{An}(i)$ , we obtain
Recall that $c_{p(i,i)}=1$ and $c_{p(i,j)}=0$ if $i\notin \mathrm{An}(j)$ .
Next we show that $(M_{uv}, v\in V_{\tau})$ are mutually dependent. When u is the source of $\tau$ , for every $j\in \mathrm{An}(u)$ the atom
has probability $\sum_{j\in \mathrm{An}(u)}b_{uj}=1$ . Hence $(M_{uv}, v\in V_{\tau})$ are at the same time perfectly dependent and independent.
For a node u which is not the source node, the general idea is to take a collection of coordinates for which the joint probability is zero while the product of the marginal probabilities is positive, thus showing that the joint probability does not equal the product of the marginal probabilities for the selected possible value of the random vector.
For brevity, let $V_{\tau}=\{1,2,\ldots,m \}$ , with the nodes labelled according to the order of their out-degrees within $\tau$ : the source node of $\tau$ has out-degree $m-1$ (the largest) and is labelled as 1, the node with out-degree $m-2$ is labelled as 2, etc.
Suppose u is the node 2. Thanks to the no-cycle property within a tournament, we have $\mathrm{An}(2)=\mathrm{An}(1)\cup \{2\}$ . For all $j\in \mathrm{An}(1)$ we have
which is an atom of $(M_{2v},v=1,\ldots, m)$ with mass $\sum_{j\in \mathrm{An}(1)}b_{2j}$ . This means that for the marginal distribution of $M_{21}$ we have $\operatorname{\mathbb{P}}(M_{21}=1/c_{12})\geq\sum_{j\in \mathrm{An}(1)}b_{2j}$ . For $j=2$ we have an atom $(0, 1, c_{23}, \ldots, c_{2m})$ with mass $b_{22}$ . This means that for the marginal probabilities of $(M_{23}, \ldots, M_{2m})$ we have $\operatorname{\mathbb{P}}(M_{2v}=c_{2v})\geq b_{22}$ for all $v=3, \ldots, m$ . Take a vector of coordinates $(1/c_{12}, 1, c_{23}, \ldots, c_{2m})$ . Note that this vector cannot be the same as the one in (23). For any $v=3, \ldots, m$ we cannot have $c_{1v}/c_{12}=c_{2v}$ because of the criticality assumption, according to which $c_{1v} > c_{12}c_{2v}$ for any $v = 3,\ldots,m$ . The joint probability of this vector of coordinates is
However, the product of the marginal probabilities is positive:
Now let $u\geq 3$ . Take the vector of coordinates in (17) corresponding to $j=1$ , which is equal to $(1/c_{1u},c_{12}/c_{1u}, \ldots, c_{1m}/c_{1u} )$ and has probability at least $b_{u1}$ . Consider also the vector of coordinates for $j=u$ , which is $(0, \ldots, 0, 1;\; c_{uv}, v=u+1, \ldots, m)$ with mass at least $b_{uu}$ . Replace the first coordinate by $1/c_{1u}$ . The vector obtained in this way has joint probability zero. For every $j\in \mathrm{pa}(u)$ we have $b_{vj}/b_{uj}=0$ when v is not a child of j, or equivalently, given the order in the node labelling, when $v<j$ . So for fixed $u\geq3$ , for $j=1$ the vector $(b_{vj}/b_{uj}, v=1, \ldots, m)$ has no zeros. For $j=2$ the vector $(b_{vj}/b_{uj}, v=1, \ldots, m)$ has one zero, namely $(0;b_{vj}/b_{uj}, v=2, \ldots, m)$ ; for $j=3$ the vector $(b_{vj}/b_{uj}, v=1, \ldots, m)$ has two zeros, namely $(0,0;b_{vj}/b_{uj}, v=3, \ldots, m)$ ; and so on until $j=u$ with the corresponding vector $(b_{vj}/b_{uj}, v=1, \ldots, m)=(0,\ldots, 0;b_{vj}/b_{uj}, v=u, \ldots, m)$ . If we replace the first coordinate by a non-zero value in this vector, we get an impossible value for the random vector $(M_{uv}, v=1, \ldots, m)$ or a value with probability zero. Considering the univariate marginal distributions of $(M_{uv}, v=1, \ldots, m)$ , we obtain for the product of the marginal probabilities a positive value:
This shows that for any $u\in V_{\tau}$ the vector $(M_{u1}, \ldots, M_{um})$ has jointly dependent elements.
Next we derive the distribution of a single element $M_{uv}$ for $v\in V_{\tau}\setminus \{u\}$ :
-
1. Consider first the case when u is the source node in $\tau$ . Since $(u, v) \in E$ , we have $\mathrm{An}(u) \subset \mathrm{An}(v)$ and thus $\mathrm{An}(v) \cap \mathrm{An}(u) = \mathrm{An}(u)$ . We have $b_{vj}>0$ , $j\in \mathrm{An}(u)$ ; hence zero is not a possible value of $M_{uv}$ . For $j\in \mathrm{An}(u)$ ,
\[ \frac{b_{vj}}{b_{uj}}=\frac{c_{p(j,u)}c_{uv}b_{jj}}{c_{p(j,u)}b_{jj}}=c_{uv}, \]
and since $\sum_{j\in \mathrm{An}(u)}b_{uj}=1$ we obtain the desired result under 1.(a).
When u is not the source node in $\tau$ , not all shortest paths to v pass through u; hence for $j\in\mathrm{An}(u)$ we have
\[ \frac{b_{vj}}{b_{uj}}=\frac{c_{p(j,v)}b_{jj}}{c_{p(j,u)}b_{jj}} =\frac{c_{p(j,v)}}{c_{p(j,u)}} >0 \]
with mass $b_{uj}$ . This yields the result in 1.(b). Note that zero is not a possible value, as we still have $\mathrm{An}(u)\subset \mathrm{An}(v)$ . Also, $c_{p(u,u)}=1$ by convention.
-
2. Let us now consider $(v,u)\in E$ . In this case $\mathrm{An}(u)\setminus\mathrm{An}(v)$ is not empty, because it contains at least the node u, so zero is a possible value of $M_{uv}$ . We need to distinguish only the zero atoms from the non-zero ones. When v is a source node in $\tau$ , for $j\in \mathrm{An}(v)$ we have
\[ \frac{b_{vj}}{b_{uj}} =\frac{c_{p(j,v)}b_{jj}}{c_{p(j,u)}b_{jj}} =\frac{c_{p(j,v)}}{c_{p(j,v)}c_{vu}} =\frac{1}{c_{vu}} >0, \]
which is an atom with probability
\[ \sum_{j\in \mathrm{An}(v)}b_{uj} =\sum_{j\in \mathrm{An}(v)}c_{p(j,v)}c_{vu}b_{jj} =c_{vu}\sum_{j \in \mathrm{An}(v)} c_{p(j,v)} b_{jj} =c_{vu}\sum_{j \in \mathrm{An}(v)} b_{vj} = c_{vu}. \]
The probability of the zero atom is $\sum_{j\in \mathrm{An}(u)\setminus \mathrm{An}(v)}b_{uj}=\sum_{j\in \mathrm{An}(u)}b_{uj}-\sum_{j\in \mathrm{An}(v)}b_{uj}=1-c_{vu}$ . This shows 2.(a).
When v is not a source node of $\tau$ , we have for $j\in \mathrm{An}(v)$
\[ \frac{b_{vj}}{b_{uj}} =\frac{c_{p(j,v)}b_{jj}}{c_{p(j,u)}b_{jj}} =\frac{c_{p(j,v)}}{c_{p(j,u)}}>0, \]
an atom with mass $b_{uj}$ and zero atom with probability $\sum_{j\in \mathrm{An}(u)\setminus \mathrm{An}(v)}b_{uj}$ . This shows 2.(b).
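The case distinctions above can be checked on a single tournament. The sketch below is an illustration under assumptions: it uses a hypothetical transitive tournament on three nodes, builds the coefficients as $b_{vj} = c_{p(j,v)} b_{jj}$ with $b_{vv}$ chosen so that each row sums to one, takes path products as the largest over directed paths, and reads the law of $M_{uv}$ as the atoms $b_{vj}/b_{uj}$ with masses $b_{uj}$. The function names are ours.

```python
from collections import defaultdict


def coefficient_matrix(nodes, c):
    """b[v][j] = c_{p(j,v)} * b[j][j], with b[v][v] chosen so that row v sums to one.

    `nodes` must be in topological order.
    """
    best = {j: {v: (1.0 if j == v else 0.0) for v in nodes} for j in nodes}
    for j in nodes:
        for v in nodes:
            for (p, q), weight in c.items():
                if q == v:
                    best[j][v] = max(best[j][v], best[j][p] * weight)
    b = {v: {} for v in nodes}
    for v in nodes:
        for j in nodes:
            if j != v:
                # best[j][v] is zero unless j is an ancestor of v, already processed.
                b[v][j] = best[j][v] * b[j].get(j, 0.0)
        b[v][v] = 1.0 - sum(b[v].values())
    return b


def increment_law(b, u, v):
    """Discrete law with atoms b_{vj} / b_{uj} and masses b_{uj}, over j with b_{uj} > 0."""
    law = defaultdict(float)
    for j, b_uj in b[u].items():
        if b_uj > 0:
            law[round(b[v][j] / b_uj, 12)] += b_uj
    return dict(law)


# Hypothetical tournament 1 -> 2, 1 -> 3, 2 -> 3 with critical edge (1, 3): 0.3 > 0.5 * 0.4.
nodes = [1, 2, 3]
c = {(1, 2): 0.5, (1, 3): 0.3, (2, 3): 0.4}
b = coefficient_matrix(nodes, c)

law_12 = increment_law(b, 1, 2)   # case 1.(a): u = 1 is the source
law_23 = increment_law(b, 2, 3)   # case 1.(b): u = 2 is not the source
law_21 = increment_law(b, 2, 1)   # case 2.(a): v = 1 is the source
print(law_12, law_23, law_21)
```

For these weights, $M_{12}$ is degenerate at $c_{12}$, $M_{23}$ has the two atoms $c_{13}/c_{12}$ and $c_{23}$, and $M_{21}$ has the atom $1/c_{12}$ together with an atom at zero.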
Remark B.1. From the results in Lemma 3.1 we see that a multiplicative increment does not have a degenerate distribution at zero, so that a product of several such multiplicative increments cannot be degenerate at zero either. This is an important observation that we will use in further proofs.
Lemma B.1. Let $(X_v, v\in V)$ follow a max-linear model as in Assumption 2.1. Let $\mathcal{T}$ have a unique source. For any $u\in V$ we have
The distribution of $A_{uv}$ depends on the three types of possible trails according to Lemma 2.3-2. In what follows we assume $(u,v)\notin E$ . For the case $(u,v)\in E$ see Lemma 3.1.
-
1. Distribution of $A_{uv}$ on a path $\{u=v_1, r=v_2, \ldots, v=v_n\}$ with $u,r\in \tau$ , one of the tournaments of $\mathcal{T}$ :
-
(a) If u is a source node in $\tau$ , then $\mathcal{L}(A_{uv})=\delta_{\{c_{p(u,v)}\}}$ .
-
(b) If u is not a source node in $\tau$ , we have
\[ \mathcal{L}(A_{uv}) =\sum_{j\in \mathrm{An}(u)} b_{uj}\delta_{\left\{\frac{c_{p(j,r)}}{c_{p(j,u)}}c_{p(r,v)}\right\}}. \]
-
2. Distribution of $A_{uv}$ on a path $\{v=v_1, r=v_2, \ldots, u=v_n\}$ with $v,r\in \tau$ :
-
(a) If v is a source node in $\tau$ , then
\[ \mathcal{L}(A_{uv}) =c_{p(v,u)}\delta_{\left\{\frac{1}{c_{p(v,u)}}\right\}} +(1-c_{p(v,u)})\delta_{\{0\}}. \]
-
(b) If v is not a source node in $\tau$ , then
\[ \mathcal{L}(A_{uv}) =\sum_{j\in \mathrm{An}(v)} c_{p(r,u)}b_{rj}\delta_{\left\{ \frac{c_{p(j,v)}}{c_{p(j,r)}c_{p(r,u)}}\right\}} +\sum_{j\in \mathrm{An}(u)\setminus \mathrm{An}(v)}b_{uj}\delta_{\{0\}}. \]
-
3. Distribution of $A_{uv}$ on a trail composed of two paths p(r,u) and p(r,v): Let the trail be on nodes $\{u, \ldots, m,r, n, \ldots, v\}$ . Also, let $\tau_m, \tau_n$ be two tournaments with $r,m\in \tau_m$ and $r,n\in \tau_n$ .
-
(a) If r is a source in both $\tau_m$ and $\tau_n$ , then
\[ \mathcal{L}(A_{uv})=c_{p(r,u)} \delta_{\left\{\frac{c_{p(r,v)}}{c_{p(r,u)}}\right\}} +(1-c_{p(r,u)})\delta_{\{0\}}. \]
-
(b) If r is a source in $\tau_m$ , but not in $\tau_n$ , then
\[ \mathcal{L}(A_{uv})=\sum_{j\in\mathrm{An}(r)}c_{p(r,u)}b_{rj} \delta_{\left\{\frac{c_{p(j,n)}c_{p(n,v)}}{c_{p(j,r)}c_{p(r,u)}}\right\}} +\sum_{j\in \mathrm{An}(u)\setminus \mathrm{An}(r)}b_{uj}\delta_{\{0\}}. \]
-
(c) If r is a source in $\tau_n$ , but not in $\tau_m$ , then
\[ \mathcal{L}(A_{uv})=\sum_{j\in\mathrm{An}(r)}c_{p(m,u)}b_{mj} \delta_{\left\{\frac{c_{p(j,r)}c_{p(r,v)}}{c_{p(j,m)}c_{p(m,u)}}\right\}} +\sum_{j\in \mathrm{An}(u)\setminus \mathrm{An}(r)}b_{uj}\delta_{\{0\}}. \]
Proof. We have already seen that from [Reference Segers31, Example 1] we have the limit
Adapting this representation to a model where we have $b_{uj}=0$ for $j\notin \mathrm{An}(u)$ and $b_{ij}=c_{p(j,i)}b_{jj}$ for $j\in \mathrm{An}(i)$ , we obtain
Recall that $c_{p(i,i)}=1$ and $c_{p(i,j)}=0$ if $i\notin \mathrm{An}(j)$ . For a single $v\in V\setminus u$ we have the marginal distribution
The distribution of $A_{uv}$ depends deterministically on properties of the TTT. When $\mathcal{T}$ has a unique source, according to Lemma 2.3-2 there are three possible shortest trails between two nodes. In addition, we also have the property in Lemma 2.3-1. We look at the different distributions of $A_{uv}$ that arise from these two properties of the TTT.
First we deal with 1.(a). Since $\mathrm{An}(u)\subset \mathrm{An}(v)$ , all atoms in (25) are positive and zero is not a possible value of $A_{uv}$ . All paths from $\mathrm{An}(u)$ to v pass through u, because u is a source in $\tau$ and because, by Property (P2) of a TTT, no cycle involving several tournaments is allowed. The case is illustrated by the graph below.
Hence for all $j\in \mathrm{An}(u)$ we have
\[ \frac{b_{vj}}{b_{uj}} =\frac{c_{p(j,u)}c_{p(u,v)}b_{jj}}{c_{p(j,u)}b_{jj}} =c_{p(u,v)} \]
with mass $\sum_{j\in \mathrm{An}(u)}b_{uj}=1$ .
Next we show 1.(b). Because $\mathrm{An}(u)\subset \mathrm{An}(v)$ , zero is not a possible value of $A_{uv}$ . Not all shortest paths from $\mathrm{An}(u)$ to v pass through u, because u is not a source in $\tau$ . However, all paths from $\mathrm{An}(u)$ to v pass through r, as shown in the picture. Paths from $\mathrm{An}(u)$ to v other than those passing through u or r are impossible because of Property (P2) of a TTT.
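For $j\in \mathrm{An}(u)$ we have
\[ \frac{b_{vj}}{b_{uj}} =\frac{c_{p(j,r)}c_{p(r,v)}b_{jj}}{c_{p(j,u)}b_{jj}} =\frac{c_{p(j,r)}}{c_{p(j,u)}}c_{p(r,v)} \]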
with mass $b_{uj}$ , hence the expression in 1.(b).
Next we show 2.(a). When the directed path is from v to u, the set $\mathrm{An}(u)\setminus \mathrm{An}(v)$ contains at least u; hence we have $b_{vj}=0$ for all $j\in \mathrm{An}(u)\setminus \mathrm{An}(v)$ . This means that zero is a possible value of $A_{uv}$ . All shortest paths from $j\in \mathrm{An}(v)$ to u pass through v, as v is a source in $\tau$ . Otherwise, there would be a cycle involving multiple tournaments, which is not allowed under Property (P2) of a TTT.
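For $j\in \mathrm{An}(v)$ , the non-zero atom is given by
\[ \frac{b_{vj}}{b_{uj}} =\frac{c_{p(j,v)}b_{jj}}{c_{p(j,v)}c_{p(v,u)}b_{jj}} =\frac{1}{c_{p(v,u)}} \]
with mass
\[ \sum_{j\in \mathrm{An}(v)}b_{uj} =c_{p(v,u)}\sum_{j\in \mathrm{An}(v)}c_{p(j,v)}b_{jj} =c_{p(v,u)}\sum_{j\in \mathrm{An}(v)}b_{vj} =c_{p(v,u)}. \]
For the zero atom we have probability
\[ \sum_{j\in \mathrm{An}(u)\setminus \mathrm{An}(v)}b_{uj} =1-\sum_{j\in \mathrm{An}(v)}b_{uj} =1-c_{p(v,u)}. \]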
This shows 2.(a).
To show 2.(b) we note that when v is not a source node of $\tau$ , not all shortest paths from $j\in \mathrm{An}(v)$ to u pass through v. However, all paths from $j\in \mathrm{An}(v)$ to u pass through r, as can be seen from the figure below.
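Hence for $j\in \mathrm{An}(v)$ we have
\[ \frac{b_{vj}}{b_{uj}} =\frac{c_{p(j,v)}b_{jj}}{c_{p(j,r)}c_{p(r,u)}b_{jj}} =\frac{c_{p(j,v)}}{c_{p(j,r)}c_{p(r,u)}}, \]
which is an atom with mass $b_{uj}=c_{p(j,r)}c_{p(r,u)}b_{jj}=c_{p(r,u)}b_{rj}$ . The zero atom comes from the fact that $b_{vj}=0$ for all $j\in \mathrm{An}(u)\setminus \mathrm{An}(v)$ , and it has probability
\[ \sum_{j\in \mathrm{An}(u)\setminus \mathrm{An}(v)}b_{uj}. \]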
This shows the distribution under 2.(b).
By Lemma 2.3-1, the node r, as part of the path p(r, u), is allowed not to be a source node in $\tau_m$ . A similar statement can be made in relation to the path p(r, v). However, when we combine p(r, u) and p(r, v) in one trail t(u, v), the node r should be a source in at least one of $\tau_m$ and $\tau_n$ . Indeed, if r were a source in neither $\tau_m$ nor $\tau_n$ , then r would have parents in two different tournaments, that is, there would be a v-structure. However, Lemma 2.2-2 excludes v-structures when $\mathcal{T}$ has a unique source; hence node r should be a source in at least one of the tournaments $\tau_m$ and $\tau_n$ .
To show 3.(a), we note that all paths from $j\in \mathrm{An}(r)$ to u and to v pass through r, as r is a source in both $\tau_n$ and $\tau_m$ . The case is depicted in the following picture.
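Also, we have $b_{vj}=0$ for all $j\in \mathrm{An}(u)\setminus \mathrm{An}(r)$ . For $j\in \mathrm{An}(r)$ we have
\[ \frac{b_{vj}}{b_{uj}} =\frac{c_{p(j,r)}c_{p(r,v)}b_{jj}}{c_{p(j,r)}c_{p(r,u)}b_{jj}} =\frac{c_{p(r,v)}}{c_{p(r,u)}} \]
with probability
\[ \sum_{j\in \mathrm{An}(r)}b_{uj} =c_{p(r,u)}\sum_{j\in \mathrm{An}(r)}c_{p(j,r)}b_{jj} =c_{p(r,u)}\sum_{j\in \mathrm{An}(r)}b_{rj} =c_{p(r,u)}. \]
The probability of the zero atom is
\[ \sum_{j\in \mathrm{An}(u)\setminus \mathrm{An}(r)}b_{uj} =1-\sum_{j\in \mathrm{An}(r)}b_{uj} =1-c_{p(r,u)}. \]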
Next we show 3.(b). Because r is not a source in $\tau_n$ , not all paths from $\mathrm{An}(r)$ to v pass through r, but they do all pass through n. Also, all paths from $\mathrm{An}(r)$ to u pass through r, because r is a source in $\tau_m$ .
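Hence, for $j\in \mathrm{An}(r)$ ,
\[ \frac{b_{vj}}{b_{uj}} =\frac{c_{p(j,n)}c_{p(n,v)}b_{jj}}{c_{p(j,r)}c_{p(r,u)}b_{jj}} =\frac{c_{p(j,n)}c_{p(n,v)}}{c_{p(j,r)}c_{p(r,u)}}, \]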
which is an atom with mass $b_{uj}=c_{p(j,r)}c_{p(r,u)}b_{jj}=b_{rj}c_{p(r,u)}$ . The zero atom has probability equal to $\sum_{j\in \mathrm{An}(u)\setminus \mathrm{An}(r)}b_{uj}$ .
Next we show 3.(c). When r is a source in $\tau_n$ , this means that all paths from $\mathrm{An}(r)$ to v pass through r. Because r is not a source in $\tau_m$ , not all paths from $\mathrm{An}(r)$ to u pass through r, but they do all pass through m.
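For $j\in \mathrm{An}(r)$ we have
\[ \frac{b_{vj}}{b_{uj}} =\frac{c_{p(j,r)}c_{p(r,v)}b_{jj}}{c_{p(j,m)}c_{p(m,u)}b_{jj}} =\frac{c_{p(j,r)}c_{p(r,v)}}{c_{p(j,m)}c_{p(m,u)}}, \]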
which is an atom with mass $b_{uj}=c_{p(j,m)}c_{p(m,u)}b_{jj}=b_{mj}c_{p(m,u)}$ . The zero atom comes from $b_{vj}=0$ for all $j\in\mathrm{An}(u)\setminus \mathrm{An}(r)$ . It has probability $\sum_{j\in \mathrm{An}(u)\setminus \mathrm{An}(r)}b_{uj}$ .
Proof of Proposition 3.1. First we prove that (i) implies (ii). Assume $\mathcal{T}$ has a unique source. We have to prove that for any $u\in V$ , an element from the limiting vector in (15) is given by (16).
In Lemma B.1 we have seen a number of cases for the distribution of $A_{uv}$ , depending on deterministic properties of the trail between u and v. Below we consider each of these cases again.
Case 1. Let the unique shortest trail between u and v be a path on the node sequence $\{u=v_1, r=v_2, \ldots, v_n=v \}$ . Let $\tau$ be the tournament containing u, r.
Case 1. (a). Let u be the source in $\tau$ . From Lemma B.1-1.(a) we have $\operatorname{\mathbb{P}}(A_{uv}=c_{p(u,v)})=1$ . Consider the variables $(M_e, e\in p(u,v))$ , which are by construction independent of each other, because their edges belong to different tournaments. Note that in this case every node $v_1,\ldots, v_{n-1}$ is the source node in the tournament containing that node and the next one in the sequence; this follows from Lemma 2.3-1. Then, by Lemma 3.1-1.(a), for every $M_e$ with $e\in p(u,v)$ we have $\operatorname{\mathbb{P}}(M_e=c_e)=1$ , and hence
which shows $A_{uv}=\prod_{e\in p(u,v)}M_e$ .
Case 1. (b). If u is not the source in $\tau$ , the distribution of $M_{ur}$ is as in Lemma 3.1-1.(b). As in Case 1.(a), every node $r=v_2,v_3,\ldots, v_{n-1}$ is the source node in the tournament containing that node and the next one in the sequence. The variables $M_e, e\in p(r,v)$ are degenerate at $c_e$ . As in Case 1.(a), the variables $(M_e, e\in p(u,v))$ are by construction independent of each other, because they are indexed by edges which belong to different tournaments. Then we have
The symbol $\otimes$ denotes multiplication between two discrete probability measures, say $\mu$ and $\nu$ , of two independent variables, say $\xi_1$ and $\xi_2$ , respectively. For two possible values $a_1,a_2$ of $\xi_1, \xi_2$ respectively, we have $\mu(\{a_1\})\nu(\{a_2\})$ as a measure of the event $\{\xi_1\xi_2=a_1a_2\}=\{\xi_1=a_1, \xi_2=a_2\}$ . The last expression in (26) is the distribution of $A_{uv}$ in Lemma B.1-1.(b).
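The factorization in Case 1.(b) can be verified numerically on a small example. The sketch below is an illustration under assumptions: the TTT (a triangle on nodes 1, 2, 3 followed by the edge 3→4) and its weights are hypothetical, the coefficients are built as $b_{vj} = c_{p(j,v)} b_{jj}$ with rows summing to one, the laws of $A_{uv}$ and $M_{uv}$ are both read as atoms $b_{vj}/b_{uj}$ with masses $b_{uj}$, and `product_law` is our name for the operation $\otimes$.

```python
from collections import defaultdict


def coefficient_matrix(nodes, c):
    """b[v][j] = c_{p(j,v)} * b[j][j], rows summing to one; nodes in topological order."""
    best = {j: {v: (1.0 if j == v else 0.0) for v in nodes} for j in nodes}
    for j in nodes:
        for v in nodes:
            for (p, q), weight in c.items():
                if q == v:
                    best[j][v] = max(best[j][v], best[j][p] * weight)
    b = {v: {} for v in nodes}
    for v in nodes:
        for j in nodes:
            if j != v:
                b[v][j] = best[j][v] * b[j].get(j, 0.0)
        b[v][v] = 1.0 - sum(b[v].values())
    return b


def ratio_law(b, u, v):
    """Discrete law with atoms b_{vj} / b_{uj} and masses b_{uj}."""
    law = defaultdict(float)
    for j, b_uj in b[u].items():
        if b_uj > 0:
            law[round(b[v][j] / b_uj, 12)] += b_uj
    return dict(law)


def product_law(law1, law2):
    """Law of the product of two independent discrete variables (the operation ⊗)."""
    out = defaultdict(float)
    for x, p in law1.items():
        for y, q in law2.items():
            out[round(x * y, 12)] += p * q
    return dict(out)


# Hypothetical TTT: tournament {1, 2, 3} with source 1, and tournament {3, 4} with source 3.
nodes = [1, 2, 3, 4]
c = {(1, 2): 0.5, (1, 3): 0.3, (2, 3): 0.4, (3, 4): 0.7}
b = coefficient_matrix(nodes, c)

# Path 2 -> 3 -> 4, where u = 2 is not the source of its tournament (Case 1.(b)).
law_A24 = ratio_law(b, 2, 4)
law_product = product_law(ratio_law(b, 2, 3), ratio_law(b, 3, 4))
print(law_A24, law_product)
```

Here $M_{34}$ is degenerate at $c_{34}$, so the product simply rescales the two atoms of $M_{23}$, and both computations give the same two-point law.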
Case 2. Let the unique shortest trail between u and v be a path from v to u on the node sequence $\{v=v_1, r=v_2, \ldots, v_n=u \}$ . Let $\tau$ be the tournament containing v, r.
Case 2. (a). Let v be the source in $\tau$ . Consider the random variables $M_{v_{i+1}v_{i}}$ , $i=1,\ldots, n-1,$ whose distributions are as in Lemma 3.1-2.(a). Since this is the unique shortest trail from v to u, all edges on it belong to different tournaments, and the vector $(M_{v_{i+1}v_{i}}, i=1,\ldots, n-1)$ contains independent variables by definition. Then
For the zero atom we have