1. Introduction
The optimal reinsurance problem has been a popular topic since the seminal work of Borch (1960) and Arrow (1963). Especially after the introduction of coherent risk measures in Artzner et al. (1999) and convex risk measures in Frittelli and Rosazza Gianin (2002) and Föllmer and Schied (2002), the classical optimal reinsurance problem based on a risk measure has been widely studied under various choices of risk measures and different constraints on premiums; see, for example, Cai and Tan (2007), Chi and Tan (2011), Cui et al. (2013), Cheung et al. (2014) and Cai et al. (2016). See Cai and Chi (2020) for a review of optimal reinsurance designs based on risk measures.
In a classical reinsurance problem, the distribution of a loss is assumed to be precisely known. In practice, however, often only partial information about the loss distribution is available, owing to the lack of data and to estimation error. Recently, incorporating model uncertainty into the evaluation of a risk and into reinsurance design has drawn increasing attention. Generally, model uncertainty is described by an uncertainty set. Two common choices are the moment-based uncertainty set and the distance-based uncertainty set: the former contains distributions satisfying certain moment constraints, while the latter contains distributions within a given distance of a reference distribution. The introduction of uncertainty into the evaluation of a risk motivates the study of worst-case risk measures. For instance, El Ghaoui et al. (2003) studied the worst-case Value-at-Risk (VaR) and obtained a closed-form solution for the worst-case VaR over an uncertainty set that contains distributions with known mean and variance. Natarajan et al. (2010) derived the worst-case Conditional Value-at-Risk (CVaR) for the same uncertainty set as in El Ghaoui et al. (2003). In addition, Li (2018) extended those results to a general class of law-invariant coherent risk measures. See Schied et al. (2009) for a review of robust preferences as a robust approach to the problem of model uncertainty. In reinsurance design, Hu et al. (2015) studied optimal reinsurance with stop-loss contracts and incomplete information on the loss distribution, in the sense that only the first two moments of the loss are known. See Pflug et al. (2017), Birghila and Pflug (2019) and Gavagan et al. (2022) for the design of an optimal insurance policy with the uncertainty set defined by the Wasserstein distance. Asimit et al. (2017) considered model uncertainty in insurance contract design by maximizing over a finite set of probability measures.
In this paper, we study a distributionally robust optimal reinsurance problem with a risk measure called the expectile. Expectiles, introduced in Newey and Powell (1987) as the minimizers of an asymmetric quadratic loss function in the context of regression, are gaining increasing popularity in the econometrics literature (e.g., Kuan et al. 2009) and in actuarial science (e.g., Bellini et al. 2014 and Cai and Weng 2016). Bellini et al. (2014) showed that the expectile is a coherent risk measure under certain conditions and that it is robust in the sense of Lipschitz continuity with respect to the Wasserstein metric. We assume that the distribution of a loss is partially known, in the sense that the mean and variance of the loss are known. The distributionally robust optimal reinsurance problem we study is a minimax problem: the inner problem maximizes the risk of the total retained loss over all distributions with the given mean and variance, and the outer problem minimizes over all stop-loss reinsurance contracts. The main idea in solving the inner problem is to show that it is equivalent to an optimization over all three-point distributions with the given mean and variance, which reduces the infinite-dimensional optimization problem to a finite-dimensional one. At first glance, this conclusion seems similar to the one obtained in Liu and Mao (2022) for a distributionally robust reinsurance problem with VaR and CVaR. However, the proof of our main result is different from that in Liu and Mao (2022), because an expectile at a level different from $1/2$ does not admit an explicit formula in terms of the distribution function, as VaR and CVaR do.
In addition, in contrast to Liu and Mao (2022), we do not obtain a closed-form solution to the reinsurance problem based on the expectile, but arrive at a finite-dimensional optimization problem. The main contribution of this paper is to show that the worst-case distribution is among the three-point distributions, which reduces the infinite-dimensional optimization problem to a finite-dimensional one. We emphasize that our main results are nontrivial, as the classical minimax theorem and duality cannot be applied directly to the problem, and a new technique is needed to obtain the main result.
The rest of the paper is organized as follows. In Section 2, the definition and properties of an expectile are given, and we present our distributionally robust reinsurance problem as a minimax problem. Section 3 aims to tackle the inner problem of the minimax problem. Proofs of the main results are given in Section 4. Numerical examples are given in Section 5 to study the impacts of the parameters on the optimal solution. Concluding remarks are given in Section 6.
2. Expectile and problem formulation
2.1. Expectile
The expectile, first introduced in Newey and Powell (1987) as the minimizer of an asymmetric quadratic loss function in the context of regression, is defined as follows.
Definition 1. The $\alpha$-expectile of a loss random variable X with ${\mathbb{E}}[X^2]<\infty$ at a confidence level $\alpha \in (0, 1)$ , denoted by $e_\alpha(X)$ , is defined as the unique minimizer of the following problem:
(2.1) \begin{align}\min_{x\in{\mathbb{R}}}\;\alpha\,{\mathbb{E}}\!\left[(X-x)_+^2\right]+(1-\alpha)\,{\mathbb{E}}\!\left[(x-X)_+^2\right],\end{align}
where $(x)_+\;:\!=\;\max\{x, 0\}$ .
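For a loss with finitely many outcomes, the minimizer of the asymmetric quadratic problem above can be computed by bisection on the first-order condition of Proposition 1(i) below, since the difference of its two sides is decreasing in the candidate value. A minimal sketch (the function name and the discrete distributions are ours, for illustration only):

```python
def expectile(xs, ps, alpha, tol=1e-12):
    """alpha-expectile of a finite discrete loss with P(X = xs[i]) = ps[i].

    Bisection on the first-order condition
        alpha * E[(X - e)_+] = (1 - alpha) * E[(e - X)_+],
    whose left-minus-right side is decreasing in e.
    """
    def foc(e):
        pos = sum(p * max(x - e, 0.0) for x, p in zip(xs, ps))
        neg = sum(p * max(e - x, 0.0) for x, p in zip(xs, ps))
        return alpha * pos - (1.0 - alpha) * neg

    lo, hi = min(xs), max(xs)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# At alpha = 1/2 the expectile is the mean; above 1/2 it exceeds the mean.
print(expectile([0.0, 10.0], [0.5, 0.5], 0.5))   # ~5.0 (the mean)
print(expectile([0.0, 10.0], [0.5, 0.5], 0.9))   # ~9.0
```

The second printed value solves $0.9\cdot 0.5\,(10-e)=0.1\cdot 0.5\,e$, that is, $e=9$.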
Being the minimizer of a weighted mean squared error, an expectile has the property of elicitability, which is desirable because a risk measure has to be estimated from historical data, and an elicitable risk measure makes it possible to verify and compare competing estimation procedures (e.g., Gneiting 2011; Kratz et al. 2018; Bettels et al. 2022). Building on Weber (2006), Bellini and Bignozzi (2015) provided a full characterization of all elicitable monetary risk measures. See Bellini et al. (2014), Ziegel (2016), Embrechts et al. (2021) and the references therein for more discussion of the elicitability of risk measures and related properties. The following proposition lists properties of expectiles given in Bellini et al. (2014) and Cai and Weng (2016).
Proposition 1. Let X be a loss random variable with ${\mathbb{E}}[X^2]<\infty$ and let $e_\alpha(X)$ be the $\alpha$-expectile of X, $\alpha \in (0, 1)$ . Then

(i) A number $e_\alpha(X)\in{\mathbb{R}}$ solves optimization problem (2.1) if and only if
(2.2) \begin{align}\alpha \mathbb{E}\left[(X-e_\alpha(X))_+\right]=(1-\alpha) \mathbb{E}\left[(e_\alpha(X)-X)_+\right].\end{align} 
(ii) The expectile $e_\alpha(X)$ is a coherent risk measure if $\alpha\geqslant 1/2$ .

(iii) $e_\alpha(X)\leqslant \mathrm{ess\,sup}\, X$ .

(iv) $e_\alpha(X)={\mathbb{E}}[X]+\beta{\mathbb{E}}[(X- e_\alpha(X))_+]$ with $\beta=\frac{2\alpha-1}{1-\alpha}$ .
Proposition 1(iv) implies that $e_\alpha(X)\leqslant {\mathbb{E}}[X]$ for $\alpha\leqslant 1/2$ and $e_\alpha(X)\geqslant {\mathbb{E}}[X]$ for $\alpha\geqslant 1/2$ . For the purpose of risk management in insurance and finance, a risk measure of a loss random variable, as a tool for calculating premiums or regulatory capital requirements, is normally required to be larger than the expected loss. In addition, the expectile is a coherent risk measure for $\alpha\geqslant 1/2$ , possessing the subadditivity property, which is a natural requirement reflecting that “a merger does not create extra risk”. Therefore, throughout this paper we are interested in the case $\alpha>1/2$ ; we will also show later that the reinsurance problem is trivial for $\alpha\leqslant 1/2$ (see Proposition 2).
2.2. Distributionally robust reinsurance with expectile
Let $X$ be a nonnegative ground-up loss faced by an insurer. The insurer transfers part of the loss, say $I(X)$ , to a reinsurer at the cost of paying a reinsurance premium. The reinsurance premium is considered as a function of the reinsurance contract $I(X)$ , denoted by $\pi\left(I(X)\right)$ . In a reinsurance contract, the function $I(\cdot)$ is called a ceded loss function. After purchasing the reinsurance contract $I(X)$ , the total retained risk exposure of the insurer is $X-I(X)+\pi\left(I(X)\right)$ . In this paper, we determine the optimal ceded loss function, or reinsurance contract, from the insurer’s perspective rather than the reinsurer’s.
In a classical reinsurance problem, the distribution of the ground-up loss X is assumed to be precisely known. The aim of a classical reinsurance problem is to find an optimal reinsurance contract so that the risk measurement of the total retained risk exposure of the insurer is minimized, that is,
(2.3) \begin{align}\min_{I \in \mathcal{I}}\; \rho\left(X-I(X)+\pi\left(I(X)\right)\right),\end{align}
where $\rho$ is a risk measure and $\mathcal{I}$ is a set of candidate reinsurance contracts. See Cai and Chi (2020) for a review of classical optimal reinsurance designs with risk measures.
In this paper, we consider a distributionally robust optimal reinsurance problem in which the cumulative distribution function (cdf) of the ground-up loss X is not completely known. Throughout the paper, we assume that the distribution of the ground-up loss is partially known, in the sense that only the mean and variance of X are known. Given a pair of nonnegative mean and standard deviation $(\mu, \sigma)$ of X, define the uncertainty set
\begin{align*}S(\mu, \sigma)\;:\!=\;\left\{F \mbox{ is a cdf on } [0,\infty)\;:\; {\mathbb{E}}^F[X]=\mu,\;\; \mathrm{Var}^F(X)=\sigma^2\right\}.\end{align*}
Let $\mathcal{I}$ be the class of stop-loss reinsurance contracts. A stop-loss reinsurance contract is defined as $I(X)=(X-d)_+$ , $d\in[0, \infty]$ , where d is called a deductible. By convention, $(X-\infty)_+=0$ . Borch (1960) showed that a stop-loss reinsurance is optimal when the insurer minimizes the variance of its total retained risk exposure with the premium computed under the expected value premium principle. Moreover, Arrow (1963) showed that a stop-loss reinsurance is also optimal if the insurer maximizes the expected utility of its terminal wealth under the expected value premium principle. A similar conclusion was obtained in Cheung et al. (2014) under law-invariant convex risk measures. Furthermore, stop-loss reinsurance is popular in practice. Thus, we consider stop-loss reinsurance contracts as our candidate reinsurance contracts. A common premium principle is the expected value premium principle, defined as $\pi\left(I(\cdot)\right)=(1+\theta){\mathbb{E}}[I(\cdot)]$ for a reinsurance contract $I\in\mathcal{I}$ , where $\theta>0$ is called a safety loading factor. We are interested in the following distributionally robust reinsurance problem with the expectile risk measure under the expected value premium principle:
(2.4) \begin{align}\min_{I \in \mathcal{I}}\; \sup_{F \in S(\mu, \sigma)}\; e_\alpha^F\!\left(X-I(X)+(1+\theta){\mathbb{E}}^F[I(X)]\right),\end{align}
where $\alpha> 1/2$ and the superscript F indicates that the expectile and the premium are calculated with X following the distribution F. With $I(X)=(X-d)_+$ , the total retained risk exposure of the insurer is $X-I(X)+\pi\left(I(X)\right)=X\wedge d+(1+\theta){\mathbb{E}}[(X-d)_+]$ , where $x\wedge y\;:\!=\;\min\{x, y\}$ . Furthermore, by the translation invariance of a coherent risk measure, problem (2.4) can be reduced to
(2.5) \begin{align}\min_{d \in[0,\infty]}\; \sup_{F \in S(\mu, \sigma)}\left\{e_{\alpha}^F(X \wedge d)+(1+\theta){\mathbb{E}}^F[(X-d)_+]\right\}.\end{align}
A distribution $F \in S(\mu, \sigma)$ that solves the inner problem of (2.5) is called the worst-case distribution. Notably, if $\alpha\leqslant 1/2$ , we can show that the objective function $e_{\alpha}^F(X \wedge d)+(1+\theta){\mathbb{E}}^F[(X-d)_+]$ is always decreasing in d, and thus the optimal deductible of problem (2.5) is $d^*=\infty$ .
Proposition 2. For $\alpha\leqslant 1/2$ , the optimal deductible of problem (2.5) is $d^*=\infty$ .
Proof. Denote $g^F(d) \;:\!=\; e_{\alpha}^F(X \wedge d)+(1+\theta){\mathbb{E}}^F[(X-d)_+]$ , $d\in{\mathbb{R}}$ . It suffices to show that $g^F(d)$ is decreasing in $d\geqslant 0$ , which obviously implies that $\sup_{F\in \mathcal S(\mu,\sigma)}g^F(d)$ is decreasing in $d\geqslant 0$ . By the definition of $e_{\alpha}^F(X \wedge d)\;=\!:\;x_d$ in (2.2), $x_d$ satisfies
\begin{align*}\alpha \int_{x_d}^{d} \overline{F}(y)\,{\mathrm{d}} y=(1-\alpha)\int_{0}^{x_d} F(y)\,{\mathrm{d}} y,\end{align*}
where $\overline{F}(y)=1-F(y)$ . Taking the (left-)derivative with respect to d yields
\begin{align*}\alpha\left(\overline{F}(d)-\overline{F}(x_d)\,\frac{\partial x_d}{\partial d}\right)=(1-\alpha)F(x_d)\,\frac{\partial x_d}{\partial d},\quad\mbox{that is,}\quad \frac{\partial x_d}{\partial d}=\frac{\alpha\overline{F}(d)}{(1-\alpha)F(x_d)+\alpha\overline{F}(x_d)}.\end{align*}
Noting that ${\partial {\mathbb{E}}[(X-d)_+]}/{\partial d} = -\overline{F}(d)$ , we have
\begin{align*}\frac{\partial g^F(d)}{\partial d}=\overline{F}(d)\left(\frac{\alpha}{(1-\alpha)F(x_d)+\alpha\overline{F}(x_d)}-(1+\theta)\right)\leqslant 0,\end{align*}
where the inequality follows from $\alpha\leqslant 1/2$ , since then $(1-\alpha)F(x_d)+\alpha\overline{F}(x_d)\geqslant \alpha\left(F(x_d)+\overline{F}(x_d)\right)=\alpha$ , so that the fraction is at most $1<1+\theta$ . Thus, $g^F(d)$ is decreasing in $d\geqslant 0$ , which completes the proof.
By Proposition 2, the distributionally robust reinsurance problem is trivial for $\alpha\leqslant 1/2$ , with optimal deductible $d^*=\infty$ . Therefore, in the rest of this paper we only need to consider the case $\alpha>1/2$ . In the next section, we first solve the inner problem of (2.5) for $\alpha>1/2$ ; that is, we work out the worst-case distribution of the inner problem of (2.5).
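Proposition 2 can be sanity-checked numerically. In the sketch below (the discrete loss and the values $\alpha=0.4$ , $\theta=0.2$ are ours, for illustration), the map $d\mapsto e_\alpha(X\wedge d)+(1+\theta)\,{\mathbb{E}}[(X-d)_+]$ is evaluated on a grid and is indeed non-increasing:

```python
def expectile(xs, ps, alpha, tol=1e-12):
    # Bisection on alpha*E[(X-e)_+] = (1-alpha)*E[(e-X)_+].
    def foc(e):
        pos = sum(p * max(x - e, 0.0) for x, p in zip(xs, ps))
        neg = sum(p * max(e - x, 0.0) for x, p in zip(xs, ps))
        return alpha * pos - (1.0 - alpha) * neg
    lo, hi = min(xs), max(xs)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if foc(mid) > 0.0 else (lo, mid)
    return 0.5 * (lo + hi)

def retained(d, xs, ps, alpha, theta):
    # g(d) = e_alpha(X ^ d) + (1 + theta) * E[(X - d)_+]
    capped = [min(x, d) for x in xs]
    stop_loss = sum(p * max(x - d, 0.0) for x, p in zip(xs, ps))
    return expectile(capped, ps, alpha) + (1.0 + theta) * stop_loss

xs, ps = [0.0, 2.0, 5.0, 10.0], [0.4, 0.3, 0.2, 0.1]   # hypothetical loss
grid = [0.25 * k for k in range(49)]                    # d in [0, 12]
vals = [retained(d, xs, ps, alpha=0.4, theta=0.2) for d in grid]
# For alpha <= 1/2 the retained-risk objective never increases in d:
assert all(later <= earlier + 1e-9 for earlier, later in zip(vals, vals[1:]))
```

At $d=0$ the value is $(1+\theta)\,{\mathbb{E}}[X]$ , matching the full-reinsurance case discussed later.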
Remark 1. It is tempting to use the minimax theorem to tackle problem (2.5), since both $e^F_\alpha(X)$ and ${\mathbb{E}}^F[(X-d)_+]$ are quasi-linear in $F$ (which does not imply that the objective function as a whole is quasi-linear in $F$ ). However, the quasi-convexity or quasi-concavity of the objective function with respect to $(X,d)$ or $(F,d)$ for problem (2.5) cannot be established, since $e_\alpha$ is convex in $d$ but the functional $X\wedge d$ is concave in $d$ . Therefore, the minimax theorem and duality arguments cannot be applied directly to problem (2.5).
3. Main results
3.1. The worst-case distribution
In this section, we focus on tackling the inner problem of (2.5) for $\alpha>1/2$ , that is,
(3.1) \begin{align}\sup_{F \in S(\mu, \sigma)}\left\{e_{\alpha}^F(X \wedge d)+(1+\theta){\mathbb{E}}^F[(X-d)_+]\right\}.\end{align}
We aim to show that the worst-case distribution of the optimization problem (3.1), if it exists, must be a three-point distribution; that is, it belongs to the following uncertainty set:
(3.2) \begin{align}S_3(\mu, \sigma)\;:\!=\;\left\{F \in S(\mu, \sigma)\;:\; F \mbox{ is a three-point distribution}\right\}.\end{align}
Here, we adopt the convention that two-point distributions and point masses are special cases of three-point distributions. The following theorem states that the worst-case distribution of problem (3.1) is among the three-point distributions.
Theorem 1. For $d \geqslant 0$ and $\alpha>1/2$ , the problem (3.1) is equivalent to
(3.3) \begin{align}\sup_{F \in S_3(\mu, \sigma)}\left\{e_{\alpha}^F(X \wedge d)+(1+\theta){\mathbb{E}}^F[(X-d)_+]\right\}\end{align}
in the sense that the two problems have the same optimal value. Moreover, the worst-case distribution of the problem (3.1) exists if and only if the worst-case distribution of the problem (3.3) exists, and any worst-case distribution of the problem (3.3) must also be a worst-case distribution of the problem (3.1).
Theorem 1 states that we can restrict attention to the set of three-point distributions without loss of generality. The following remark illustrates that, in general, the worst-case distribution of the problem (3.1) is not unique, and distributions outside the set $S_3(\mu, \sigma)$ may also attain the supremum.
Remark 2. Generally speaking, the worst-case distribution of the problem (3.1) is not unique. For example, letting $d=0$ , the problem (3.1) reduces to
\begin{align*}\sup_{F \in S(\mu, \sigma)}\;(1+\theta){\mathbb{E}}^F[X].\end{align*}
In this special case, the optimal value is $(1+\theta)\mu$ and any feasible distribution is a worst-case distribution. We also point out that the case $d=0$ corresponds to full reinsurance, which is a common reinsurance treaty in practice; see the numerical results in Section 5.
From Theorem 1, we know that the worst-case distribution of problem (3.1) is among three-point distributions of a more specific form. Denote by
\begin{align*}[x_1,p_1;\;x_2,p_2;\;x_3,p_3]\end{align*}
a three-point distribution of a random variable X with $\mathbb{P}(X=x_i)=p_i\geqslant 0$ , $i=1,2,3$ , where $0\leqslant x_1\leqslant x_2\leqslant x_3$ and $p_1+p_2+p_3=1$ . More specifically, we can get the following result from the proof of Theorem 1.
Corollary 1. For $d\geqslant 0$ , the problem (3.3), and thus the problem (3.1), is equivalent to
(3.4) \begin{align}\sup_{F \in \mathcal{S}_3^*(\mu,\sigma;\;d)}\left\{e_{\alpha}^F(X \wedge d)+(1+\theta){\mathbb{E}}^F[(X-d)_+]\right\},\end{align}
where
\begin{align*}\mathcal{S}_3^*(\mu,\sigma;\;d)\;:\!=\;\left\{[x_1,p_1;\;x_2,p_2;\;x_3,p_3]\in S_3(\mu,\sigma)\;:\; x_1\leqslant e_\alpha^F(X \wedge d) \leqslant x_2\leqslant d< x_3\right\}.\end{align*}
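In this three-point parameterization, once the support points are fixed, the moment constraints determine the probabilities through a linear system, which the Lagrange-basis identity solves in closed form. A small self-contained sketch (the function name and the numbers are ours, for illustration):

```python
def three_point_probs(x1, x2, x3, mu, m2):
    """Probabilities on distinct points (x1, x2, x3) with E[X] = mu and
    E[X^2] = m2; returns None if the moment constraints are infeasible.

    Uses p_i = E[l_i(X)], where l_i is the quadratic Lagrange polynomial
    equal to 1 at x_i and 0 at the other two support points.
    """
    def mass(a, b, c):
        return (m2 - (b + c) * mu + b * c) / ((a - b) * (a - c))
    p = (mass(x1, x2, x3), mass(x2, x1, x3), mass(x3, x1, x2))
    return p if all(q >= -1e-12 for q in p) else None

# Mean 10/3 and second moment 15 (variance 35/9) on the points (2, 5, 7):
p = three_point_probs(2.0, 5.0, 7.0, 10.0 / 3.0, 15.0)
print(p)  # ~(2/3, 1/6, 1/6)
```

Infeasible support choices simply return `None`, which makes the helper convenient inside a search over supports.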
3.2. Transformations of the main problem
In this subsection, we aim to transform problem (2.5) into a tractable finite-dimensional problem based on Theorem 1 and Corollary 1. We first make the following observations. For any $F=[x_1,p_1;\;x_2,p_2;\;x_3,p_3] \in \mathcal{S}_3^*(\mu,\sigma;\;d)$ , by Proposition 1(i), one can verify that
(3.5) \begin{align}e_\alpha^F(X \wedge d)=\frac{(1-\alpha)p_1x_1+\alpha(p_2x_2+p_3d)}{(1-\alpha)p_1+\alpha(p_2+p_3)},\end{align}
and hence, writing
(3.6) \begin{align}f^F(d,X)\;:\!=\;e_\alpha^F(X \wedge d)+(1+\theta){\mathbb{E}}^F[(X-d)_+],\end{align}
we obtain
(3.7) \begin{align}f^F(d,X)=\frac{(1-\alpha)p_1x_1+\alpha(p_2x_2+p_3d)}{(1-\alpha)p_1+\alpha(p_2+p_3)}+(1+\theta)p_3(x_3-d).\end{align}
Combining this with Theorem 1, we conclude that the infinite-dimensional optimization problem (3.1) can be reduced to a finite-dimensional optimization problem:
(3.8) \begin{align}\max_{x_i,\,p_i,\,i=1,2,3}\;\;&\frac{(1-\alpha)p_1x_1+\alpha(p_2x_2+p_3d)}{(1-\alpha)p_1+\alpha(p_2+p_3)}+(1+\theta)p_3(x_3-d)\nonumber\\ \mbox{s.t.}\;\;\;&p_1+p_2+p_3=1,\quad p_i\geqslant 0,\quad i=1,2,3,\nonumber\\ &p_1x_1+p_2x_2+p_3x_3=\mu,\nonumber\\ &p_1x_1^2+p_2x_2^2+p_3x_3^2=\mu^2+\sigma^2,\nonumber\\ &(1-\alpha)p_1(x_1-x_2)+\alpha p_3(d-x_2)\leqslant 0,\nonumber\\ &0\leqslant x_1\leqslant x_2\leqslant d< x_3.\end{align}
The fourth constraint in (3.8) guarantees that $e_\alpha^F(X \wedge d) \leqslant x_2$ . For any three-point distribution $G=[x_1,p_1;\;x_2,p_2;\;x_3,p_3]\in \mathcal{S}_3(\mu,\sigma)$ satisfying $x_1\leqslant x_2 \leqslant e_\alpha^G(X \wedge d) \leqslant d \leqslant x_3$ , by Proposition 1(i), we obtain
\begin{align*}e_\alpha^G(X \wedge d)=\frac{(1-\alpha)(p_1x_1+p_2x_2)+\alpha p_3 d}{(1-\alpha)(p_1+p_2)+\alpha p_3}.\end{align*}
The condition $x_2 \leqslant e_\alpha^G(X \wedge d)$ is equivalent to $(1-\alpha)p_1(x_1-x_2)+\alpha p_3(d-x_2) \geqslant 0$ .
Together with Corollary 1, dropping the fourth constraint in (3.8) still leads to the same maximum of (3.7) subject to all constraints in (3.8). Hence, we have the following theorem.
Theorem 2. The optimization problem (3.1) is equivalent to
(3.9) \begin{align}\max_{x_i,\,p_i,\,i=1,2,3}\;\;&\frac{(1-\alpha)p_1x_1+\alpha(p_2x_2+p_3d)}{(1-\alpha)p_1+\alpha(p_2+p_3)}+(1+\theta)p_3(x_3-d)\nonumber\\ \mbox{s.t.}\;\;\;&p_1+p_2+p_3=1,\quad p_i\geqslant 0,\quad i=1,2,3,\nonumber\\ &p_1x_1+p_2x_2+p_3x_3=\mu,\nonumber\\ &p_1x_1^2+p_2x_2^2+p_3x_3^2=\mu^2+\sigma^2,\nonumber\\ &0\leqslant x_1\leqslant x_2\leqslant d\leqslant x_3\end{align}
in the sense that the two problems have the same optimal value. Moreover, the worst-case distribution of the problem (3.1) exists if and only if the optimal solution of the problem (3.9) exists, and the worst-case distribution of the problem (3.1) is $F^* =[x_1^*,p_1^*;\;x_2^*,p_2^*;\;x_3^*,p_3^*]$ if $(x_i^*,p_i^*,\,i=1,2,3)$ is the optimal solution of the problem (3.9).
With Theorem 2, we reduce the infinite-dimensional optimization problem (3.1), that is, the inner problem of (2.4), to a finite-dimensional optimization problem, which can be solved numerically. In Section 5, we solve problem (2.4) numerically, where the inner problem is solved by the Matlab built-in function “fmincon” and the outer problem is solved via a grid search.
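The paper uses MATLAB's fmincon for the inner problem; as a rough illustration only, the same finite-dimensional search can be sketched in pure Python by gridding the supports $0\leqslant x_1< x_2\leqslant d< x_3$ , recovering the probabilities from the moment constraints, and evaluating the objective with a numerically computed expectile. The function names, grid resolution, and parameter values below are ours, and a coarse grid only approximates the supremum:

```python
def expectile(xs, ps, alpha, tol=1e-10):
    # Bisection on alpha*E[(X-e)_+] = (1-alpha)*E[(e-X)_+].
    def foc(e):
        pos = sum(p * max(x - e, 0.0) for x, p in zip(xs, ps))
        neg = sum(p * max(e - x, 0.0) for x, p in zip(xs, ps))
        return alpha * pos - (1.0 - alpha) * neg
    lo, hi = min(xs), max(xs)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if foc(mid) > 0.0 else (lo, mid)
    return 0.5 * (lo + hi)

def three_point_probs(x1, x2, x3, mu, m2):
    # Lagrange-basis masses matching E[X] = mu, E[X^2] = m2 (None if infeasible).
    def mass(a, b, c):
        return (m2 - (b + c) * mu + b * c) / ((a - b) * (a - c))
    p = (mass(x1, x2, x3), mass(x2, x1, x3), mass(x3, x1, x2))
    return p if all(q >= -1e-12 for q in p) else None

def worst_case_inner(d, mu, sigma, alpha, theta, x3_max=30.0, step=0.5):
    # Coarse grid search over supports 0 <= x1 < x2 <= d < x3.
    m2, best = mu * mu + sigma * sigma, float("-inf")
    low = [step * k for k in range(int(d / step) + 1)]
    high = [d + step * k for k in range(1, int((x3_max - d) / step) + 1)]
    for x1 in low:
        for x2 in low:
            if x2 <= x1:
                continue
            for x3 in high:
                p = three_point_probs(x1, x2, x3, mu, m2)
                if p is None:
                    continue
                # Since x3 > d, the capped loss X ^ d sits on (x1, x2, d).
                e = expectile([x1, x2, d], p, alpha)
                best = max(best, e + (1.0 + theta) * p[2] * (x3 - d))
    return best

val = worst_case_inner(d=6.0, mu=10.0 / 3.0, sigma=(35.0 / 9.0) ** 0.5,
                       alpha=0.9, theta=0.2)
```

By construction, `val` is a lower bound for the worst-case value at this deductible, dominating the objective of every three-point distribution visited on the grid.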
Lemma 1 and Theorem 1 imply that the worst-case value of problem (3.1) is increasing in $\sigma$ ; that is, the optimal value of problem (3.1) equals the optimal value of the following problem, in which the variance constraint is relaxed to an inequality:
(3.10) \begin{align}\sup_{F \in \bigcup_{0\leqslant \tilde{\sigma} \leqslant \sigma} S_3(\mu, \tilde{\sigma})}\left\{e_{\alpha}^F(X \wedge d)+(1+\theta){\mathbb{E}}^F[(X-d)_+]\right\}.\end{align}
Therefore, we have the following reformulation of the problem (3.1).
Proposition 3. The problem (3.1) is equivalent to the following problem:
(3.11) \begin{align}\max_{x_i,\,p_i,\,i=1,2,3}\;\;&\frac{(1-\alpha)p_1x_1+\alpha(p_2x_2+p_3d)}{(1-\alpha)p_1+\alpha(p_2+p_3)}+(1+\theta)p_3(x_3-d)\nonumber\\ \mbox{s.t.}\;\;\;&p_1+p_2+p_3=1,\quad p_i\geqslant 0,\quad i=1,2,3,\nonumber\\ &p_1x_1+p_2x_2+p_3x_3=\mu,\nonumber\\ &p_1x_1^2+p_2x_2^2+p_3x_3^2\leqslant \mu^2+\sigma^2,\nonumber\\ &0\leqslant x_1\leqslant x_2\leqslant d\leqslant x_3\end{align}
in the sense that they have the same optimal value.
It is worth noting that the problem in Proposition 3 is still not easy to solve explicitly, due to the nonnegativity assumption on the loss. In what follows, we explain why the nonnegativity of the loss plays an essential role in the problem and makes it more complicated. If we drop the assumption of nonnegativity of the loss, then by the translation invariance and positive homogeneity of $e_\alpha$ , for any risk $X$ with mean 0 and variance 1, $Y\;:\!=\;\mu +\sigma X$ has mean $\mu$ and variance $\sigma^2$ , and
\begin{align*}e_\alpha(Y\wedge d_1)+(1+\theta){\mathbb{E}}[(Y-d_1)_+]=\mu+\sigma\left(e_\alpha(X\wedge d)+(1+\theta){\mathbb{E}}[(X-d)_+]\right),\end{align*}
where $d_1=\mu+\sigma d$ . Therefore, it would suffice to consider the special case of the uncertainty set with $\mu=0$ and $\sigma=1$ : if the optimal deductible in the case $\mu=0$ , $\sigma=1$ is $d^*$ , then the optimal deductible for the general case $(\mu,\sigma)$ is $\mu+\sigma d^*$ . With the constraint of nonnegativity, however, these observations no longer hold.
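The location-scale identity above is easy to verify numerically when negative losses are allowed. In this sketch (all numbers are illustrative), X takes the values $\pm 1$ with equal probability, so $Y=\mu+\sigma X$ , and both sides of the identity agree:

```python
def expectile(xs, ps, alpha, tol=1e-12):
    # Bisection on alpha*E[(X-e)_+] = (1-alpha)*E[(e-X)_+].
    def foc(e):
        pos = sum(p * max(x - e, 0.0) for x, p in zip(xs, ps))
        neg = sum(p * max(e - x, 0.0) for x, p in zip(xs, ps))
        return alpha * pos - (1.0 - alpha) * neg
    lo, hi = min(xs), max(xs)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if foc(mid) > 0.0 else (lo, mid)
    return 0.5 * (lo + hi)

alpha, theta, mu, sigma, d = 0.9, 0.2, 3.0, 2.0, 0.5
xs, ps = [-1.0, 1.0], [0.5, 0.5]           # mean 0, variance 1, can be negative
ys = [mu + sigma * x for x in xs]          # Y = mu + sigma * X
d1 = mu + sigma * d

lhs = (expectile([min(y, d1) for y in ys], ps, alpha)
       + (1.0 + theta) * sum(p * max(y - d1, 0.0) for y, p in zip(ys, ps)))
rhs = mu + sigma * (expectile([min(x, d) for x in xs], ps, alpha)
                    + (1.0 + theta) * sum(p * max(x - d, 0.0)
                                          for x, p in zip(xs, ps)))
print(lhs, rhs)  # both ~4.3
```

With a nonnegativity constraint on the loss, the shifted-and-scaled variable may leave the feasible set, which is exactly why the reduction fails there.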
The following proposition discusses the attainability of problems (3.11) and (3.9).
Proposition 4.

(i) The supremum value of the problem (3.11) is always attainable.

(ii) The supremum value of the problem (3.9) is attainable if at least one of the following conditions is violated:
(3.13) \begin{equation}(1+\theta)\left[(1-\alpha)(1-\mu/d) + \alpha\mu /d\right]^2\leqslant \alpha(1-\alpha), \qquad \mu< d,\qquad \mbox{and}\qquad \mu d < \mu^2+\sigma^2.\end{equation}
From Proposition 4, we have that for any $d\geqslant 0$ , there exists an $F^*$ with ${\mathbb{E}}^{F^*}[X]=\mu$ and $\mathrm{Var}^{F^*}(X)\leqslant \sigma^2$ such that
\begin{align*}f^{F^*}(d,X)=\sup_{F\in S(\mu,\sigma)}\left\{e_\alpha^{F}(X \wedge d)+(1+\theta){\mathbb{E}}^{F}[(X-d)_+]\right\},\end{align*}
where $f^{F^*}(d,X)\;:\!=\;e_\alpha^{F^*}(X \wedge d)+(1+\theta){\mathbb{E}}^{F^*}[(X-d)_+]$ . We also point out that if $F^*\neq [0,1-\mu/d;\; d,\mu/d]$ , then the supremum in the problem (3.9) is attained.
We close this section by showing that the main results in the paper can be generalized to the case of a higher-order moment condition. That is, if we replace the constraint on the variance by a constraint on a higher-order moment, then results parallel to Theorems 1 and 2 still hold. To be more specific, if the uncertainty set is replaced by
where $k>1$ , then similarly, we can show that Theorem 1 still holds. The following problem
is equivalent to
in the sense that the two problems have the same optimal value and optimal solution. More specifically, the worst-case distribution of the problem (3.14) exists if and only if the optimal solution of the problem (3.15) exists, and the worst-case distribution of the problem (3.14) is $F^* =[x_1^*,p_1^*;\;x_2^*,p_2^*;\;x_3^*,p_3^*]$ if $(x_i^*,p_i^*,\,i=1,2,3)$ is the optimal solution of the problem (3.15).
4. Proofs of the main results in Section 3
To prove Theorem 1, we need the following lemma.
Lemma 1. For $d \geqslant 0$ , $\sigma_1<\sigma_2$ and $\alpha \geqslant {1}/{2}$ , let $F\in S_3(\mu,\sigma_1)$ be a distribution such that $\mathbb{P}^F(X=x_i)=p_i$ , $i=1,2,3$ , with $x_1<x_2<x_3$ and
(4.1) \begin{align}x_1\leqslant e_\alpha^F(X \wedge d) \leqslant x_2\leqslant d< x_3.\end{align}
Then there exists $F^*\in S_3(\mu,\sigma_2)$ such that (4.1) holds,
\begin{align*}e_\alpha^{F^*}(X \wedge d)=e_\alpha^{F}(X \wedge d)\quad\mbox{and}\quad {\mathbb{E}}^{F^*}[(X-d)_+]\geqslant {\mathbb{E}}^{F}[(X-d)_+].\end{align*}
Proof. Our proof mainly involves two steps. We first define a three-point random variable $Y_c$ such that ${\mathbb{E}}[Y_c]={\mathbb{E}}^F[X]$ , ${\mathbb{E}}[(Y_c-d)_+]\geqslant {\mathbb{E}}^F[(X-d)_+]$ , and $e_\alpha(Y_c \wedge d)=e_\alpha^{F}(X \wedge d)$ . In general, ${\mathrm{Var}}(Y_c)=\sigma_2^2$ does not hold, and thus the next step is to modify the definition of $Y_c$ to obtain the desired distribution. To this end, for $c \in [0, p_2]$ , define a random variable
(4.2) \begin{align}Y_c=\left\{\begin{array}{ll}x_1,& \mbox{ with probability } p_1+a, \\[5pt] x_2,& \mbox{ with probability } p_2-c,\\[5pt] y,& \mbox{ with probability } p_3+c-a,\end{array}\right.\end{align}
where
\begin{align*}a\;:\!=\;\frac{\alpha(d-x_2)c}{m},\qquad m\;:\!=\;(1-\alpha)\left(e_\alpha^F(X\wedge d)-x_1\right)+\alpha\left(d-e_\alpha^F(X\wedge d)\right),\end{align*}
and
\begin{align*}y\;:\!=\;\frac{\mu-(p_1+a)x_1-(p_2-c)x_2}{p_3+c-a}.\end{align*}
One can easily verify that ${\mathbb{E}}[Y_c]=\mu.$ We claim that $y> d$ . Indeed, as
\begin{align*}y-d=\frac{\mu-(p_1+a)x_1-(p_2-c)x_2-(p_3+c-a)d}{p_3+c-a}=\frac{p_3(x_3-d)-a(x_1-d)-c(d-x_2)}{p_3+c-a},\end{align*}
it remains to show $p_3(x_3-d)> a(x_1-d)+c(d-x_2)$ . Note that
(4.3) \begin{align}a(x_1-d)+c(d-x_2)=c(d-x_2)\left(1-\frac{\alpha(d-x_1)}{m}\right)\leqslant 0,\end{align}
where the inequality follows from $m\leqslant \alpha(d-x_1)$ , which holds by $x_1\leqslant e_\alpha^F(X \wedge d) \leqslant x_2 \leqslant d$ and $\alpha\geqslant 1/2$ . As $x_3> d$ , we have $p_3(x_3-d)>0\geqslant a(x_1-d)+c(d-x_2)$ . Hence,
(4.4) \begin{align}y>d.\end{align}
Moreover, by standard computation, we have
(4.5) \begin{align}{\mathbb{E}}[(Y_c-d)_+]=(p_3+c-a)(y-d)=p_3(x_3-d)-a(x_1-d)-c(d-x_2)\geqslant p_3(x_3-d)={\mathbb{E}}^F[(X-d)_+],\end{align}
where the inequality follows from (4.3). Next, we show that $e_\alpha^F(X \wedge d)=e_\alpha(Y_c \wedge d)$ . By Proposition 1(i), we know
\begin{align*}\alpha\,{\mathbb{E}}^F\!\left[\left(X\wedge d-e_\alpha^F(X \wedge d)\right)_+\right]=(1-\alpha)\,{\mathbb{E}}^F\!\left[\left(e_\alpha^F(X \wedge d)-X\wedge d\right)_+\right],\end{align*}
which is equivalent to
(4.6) \begin{align}\alpha\left[p_2\left(x_2-e_\alpha^F(X \wedge d)\right)+p_3\left(d-e_\alpha^F(X \wedge d)\right)\right]=(1-\alpha)\,p_1\left(e_\alpha^F(X \wedge d)-x_1\right).\end{align}
It then follows from standard computation that
\begin{align*}&\alpha\,{\mathbb{E}}\!\left[\left(Y_c\wedge d-e_\alpha^F(X \wedge d)\right)_+\right]-(1-\alpha)\,{\mathbb{E}}\!\left[\left(e_\alpha^F(X \wedge d)-Y_c\wedge d\right)_+\right]\\[5pt]&\quad=\alpha\left[(p_2-c)\left(x_2-e_\alpha^F(X \wedge d)\right)+(p_3+c-a)\left(d-e_\alpha^F(X \wedge d)\right)\right]-(1-\alpha)(p_1+a)\left(e_\alpha^F(X \wedge d)-x_1\right)\\[5pt]&\quad=\alpha c(d-x_2)-am=0,\end{align*}
where the first equality follows from $e_\alpha^F(X \wedge d) \leqslant x_2 \leqslant d<y$ , the second equality follows from (4.6), and the final equality holds by the definition of a. By Proposition 1(i), we know that $e_\alpha^F(X \wedge d)$ is the $\alpha$-expectile of $Y_c \wedge d$ , that is,
(4.7) \begin{align}e_\alpha(Y_c \wedge d)=e_\alpha^F(X \wedge d).\end{align}
Now we consider the following two cases:

(i) If there exists a $c^*\in[0,p_2]$ such that $\mathrm{Var}(Y_{c^*})=\sigma_2^2$ , then combining (4.5) and (4.7), the distribution of $Y_{c^*}$ is the desired three-point distribution.

(ii) Otherwise, $\mathrm{Var}(Y_c)<\sigma_2^2$ for all $c\in[0,p_2]$ , and we need to define a new random variable. Note that for $c=p_2$ , the random variable $Y_c$ (defined in (4.2)) reduces to
\begin{align*}Y&=\left\{\begin{array}{ll}x_1,& \mbox{ with probability } q, \\[8pt] \dfrac{\mu-qx_1}{1-q},& \mbox{ with probability }1-q,\end{array}\right.\end{align*}where $q=p_1+\frac{\alpha(d-x_2)p_2}{m}.$ By (4.4), we know $\frac{\mu-qx_1}{1-q}> d$ . For $h \in [0, 1-q)$ , define(4.8) \begin{align}Z_h&=\left\{\begin{array}{ll}x_1,& \mbox{ with probability } q, \\[5pt] d,& \mbox{ with probability } h,\\[5pt] z,& \mbox{ with probability } 1-q-h,\end{array}\right.\end{align}where $z\;:\!=\;\frac{\mu-x_1q-dh}{1-q-h}$ . It is straightforward to verify that $z> d$ , ${\mathbb{E}}[Z_h]=\mu$ , $\mathrm{Var}(Z_0)=\mathrm{Var}(Y)$ and $e_\alpha(Z_h \wedge d)=e_\alpha(Y \wedge d)=e_\alpha(Y_c \wedge d)=e_\alpha^F(X \wedge d)$ . Moreover, note that\begin{align*} {\mathbb{E}}[(Z_h-d)_+]&=(z-d)(1-q-h)\\[5pt] &=(1-q)\left[\frac{\mu-x_1q}{1-q}-d\right]\\[5pt] &={\mathbb{E}}[(Y-d)_+]={\mathbb{E}}[(Y_c-d)_+] \geqslant{\mathbb{E}}^F[(X-d)_+], \end{align*}where the last inequality follows from (4.5). As $\mathrm{Var}(Y_c)<\sigma_2^2$ for all $c\in[0,p_2]$ , by the continuity of $\mathrm{Var}(Y_c)$ with respect to c, we have $\mathrm{Var}(Y_{p_2})=\mathrm{Var}(Y)< \sigma_2^2$ . Noting that ${\lim_{h \to 1-q}\mathrm{Var}(Z_h)=\infty}$ and $\mathrm{Var}(Z_0)=\mathrm{Var}(Y)< \sigma_2^2$ , there exists an $h^* \in [0,1-q)$ such that $\mathrm{Var}(Z_{h^*})=\sigma_2^2$ . In this case, the distribution of $Z_{h^*}$ is the desired distribution $F^*$ .
Combining the above two cases, we complete the proof.
To better understand the main steps in the proof of Lemma 1, we give the following example to illustrate the two-step procedure.
Example 1. Suppose $d=6$ and $\alpha=0.9$ . Consider $S_3(\mu,\sigma_1)$ with $\mu=\frac{10}{3}$ and $\sigma_1=\frac{\sqrt{35}}{3}$ . Let $\sigma_2>\sigma_1$ and F be the distribution function of a discrete random variable X, where
\begin{align*}\mathbb{P}(X=2)=\frac{2}{3},\qquad \mathbb{P}(X=5)=\frac{1}{6},\qquad \mathbb{P}(X=7)=\frac{1}{6}.\end{align*}
Obviously, $F\in S_3(\mu,\sigma_1)$ . Moreover, $2<e_{\alpha}^F(X \wedge d)=\frac{107}{22}<5<6<7$ . Hence, X satisfies the conditions of Lemma 1. For $c \in [0,\frac{1}{6}]$ , define
(4.9) \begin{align}Y_c&=\left\{\begin{array}{ll}2,& \mbox{ with probability } \dfrac{2}{3}+\dfrac{11c}{16}, \\[5pt] 5,& \mbox{ with probability } \dfrac{1}{6}-c,\\[5pt] y,& \mbox{ with probability } \dfrac{1}{6}+\dfrac{5c}{16},\end{array}\right.\end{align}
where
\begin{align*}y=\frac{\frac{10}{3}-2\left(\frac{2}{3}+\frac{11c}{16}\right)-5\left(\frac{1}{6}-c\right)}{\frac{1}{6}+\frac{5c}{16}}.\end{align*}
It is easy to verify that ${\mathbb{E}}[Y_c]=\mu$ , $y>d$ and $e_{\alpha}(Y_c \wedge d)=\frac{107}{22}$ .

(i) If $\sigma_2^2\leqslant \dfrac{400}{63}\approx 6.3492$ , we can always find a $c^*\in[0,1/6]$ such that $\mathrm{Var}(Y_{c^*})=\sigma^2_2$ ; see the left graph in Figure 1. One can verify that
\begin{align*}{\mathbb{E}}[(Y_c-d)_+]\geqslant {\mathbb{E}}^F[(X-d)_+]\;\;\mbox{ and }\;\; e_\alpha^F(X \wedge d)=e_\alpha(Y_c \wedge d).\end{align*} 
(ii) If $\sigma_2^2>\dfrac{400}{63}$ , then $\mathrm{Var}(Y_{c})<\sigma^2_2$ for all $c\in[0,1/6]$ . For $c=\dfrac{1}{6}$ , the random variable $Y_c$ defined by (4.9) reduces to
\begin{align*} Y &=\left\{\begin{array}{ll} 2,& \mbox{ with probability } {\dfrac{25}{32}}, \\[10pt] \dfrac{170}{21},& \mbox{ with probability }\dfrac{7}{32}. \end{array}\right. \end{align*}For $h \in [0, 7/32)$ , define\begin{align*}Z_h&=\left\{\begin{array}{ll}2,& \mbox{ with probability } q, \\[5pt] 6,& \mbox{ with probability } h,\\[5pt] z,& \mbox{ with probability } 1-q-h,\end{array}\right.\end{align*}where $q=\dfrac{25}{32}$ and $z=\dfrac{\frac{10}{3}-2q-6h}{1-q-h}$ . The right graph in Figure 1 plots $\mathrm{Var}(Z_h)$ against h. One can verify that ${\mathbb{E}}[Z_h]= {10}/{3}=\mu$ ,\begin{align*}{\mathbb{E}}[(Z_h-d)_+]\geqslant {\mathbb{E}}^F[(X-d)_+]\;\;\mbox{ and }\;\; e_\alpha^F(X \wedge d)=e_\alpha(Z_h \wedge d).\end{align*}
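The second step of Example 1(ii) can be replayed numerically: along the family $Z_h$ , the expectile of $Z_h\wedge d$ stays at $107/22$ while $\mathrm{Var}(Z_h)$ grows from $400/63$ without bound, so $h^*$ with $\mathrm{Var}(Z_{h^*})=\sigma_2^2$ can be located by bisection. The target value $\sigma_2^2=8$ below is an illustrative choice of ours:

```python
def expectile(xs, ps, alpha, tol=1e-12):
    # Bisection on alpha*E[(X-e)_+] = (1-alpha)*E[(e-X)_+].
    def foc(e):
        pos = sum(p * max(x - e, 0.0) for x, p in zip(xs, ps))
        neg = sum(p * max(e - x, 0.0) for x, p in zip(xs, ps))
        return alpha * pos - (1.0 - alpha) * neg
    lo, hi = min(xs), max(xs)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if foc(mid) > 0.0 else (lo, mid)
    return 0.5 * (lo + hi)

q, mu, d, alpha = 25.0 / 32.0, 10.0 / 3.0, 6.0, 0.9

def Z(h):
    # Z_h from Example 1(ii): atoms 2, d = 6 and z(h) > d.
    z = (mu - 2.0 * q - 6.0 * h) / (1.0 - q - h)
    return [2.0, 6.0, z], [q, h, 1.0 - q - h]

def variance(h):
    xs, ps = Z(h)
    m = sum(p * x for x, p in zip(xs, ps))
    return sum(p * (x - m) ** 2 for x, p in zip(xs, ps))

# The expectile of Z_h ^ d does not depend on h:
e0 = expectile([min(x, d) for x in Z(0.1)[0]], Z(0.1)[1], alpha)  # ~107/22

# Locate h* with Var(Z_{h*}) = sigma_2^2 = 8 by bisection on [0, 7/32):
lo, hi = 0.0, 7.0 / 32.0 - 1e-9
for _ in range(200):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if variance(mid) < 8.0 else (lo, mid)
h_star = 0.5 * (lo + hi)
```

Bisection applies because the variance is continuous in h, equals $400/63<8$ at $h=0$ , and diverges as $h\to 7/32$ .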
Now we are ready to prove Theorem 1.
Proof of Theorem 1. Let $f^F(d, X)=e_\alpha^F(X \wedge d)+(1+\theta) {\mathbb{E}}^F[(X-d)_+]$ for each $F \in S(\mu, \sigma)$ . Note that $e_\alpha^F(X \wedge d)\leqslant d$ , as $X \wedge d\leqslant d$ . Let $A_1=\{X \leqslant e_\alpha^F(X \wedge d)\}$ , $A_2=\{e_\alpha^F(X \wedge d) < X \leqslant d\}$ , and $A_3=\{X>d\}$ . Denote ${\mathbb{E}}^F[X\mid A_i]$ by $x_i$ , $i=1,2,3$ . Define a discrete random variable
(4.10) \begin{align}\tilde{X}=x_i \;\;\mbox{ with probability }\; \mathbb{P}^F(A_i),\quad i=1,2,3.\end{align}
Denote the distribution of $\tilde{X}$ by $\tilde{F}$ . Obviously, $\tilde{F}$ is a three-point distribution satisfying (4.1), that is, $x_1\leqslant e_\alpha^F(X \wedge d) \leqslant x_2\leqslant d< x_3.$ It follows that ${\mathbb{E}}^{\tilde{F}}[\tilde{X}]=\mu$ and
\begin{align*}{\mathbb{E}}^{\tilde{F}}[(\tilde{X}-d)_+]=\mathbb{P}^F(A_3)(x_3-d)={\mathbb{E}}^{F}[(X-d)_+].\end{align*}
By Hölder’s inequality, we have $\left(\int_{A_i}x\,{\mathrm{d}} F(x)\right)^2 \leqslant \left(\int_{A_i}x^2\,{\mathrm{d}} F(x)\right)\mathbb{P}(A_i)$ , $i=1, 2, 3$ . As a result,
Therefore, $\mathrm{Var}^{\tilde{F}}(\tilde{X})\leqslant \sigma^2=\mathrm{Var}^F(X)$ . Note that
and
By Proposition 1(i), we have $e_\alpha^{\tilde{F}}(\tilde{X} \wedge d)=e_\alpha^F(X \wedge d)$ . Hence, for any $F\in S(\mu,\sigma)$ , there exists a random variable $\tilde{X}$ , defined in (4.10) and following a three-point distribution, such that $f^{\tilde{F}}(d,\tilde{X})=f^F(d,X)$ . Next, we consider the following two cases.

(i) If $\mathrm{Var}^{\tilde{F}}(\tilde{X})=\sigma^2$ , then $\tilde{F} \in S_3(\mu, \sigma)$ and $f^{\tilde{F}}(d, \tilde{X})=f^F(d, X)$ . The result follows.

(ii) If $\mathrm{Var}^{\tilde{F}}(\tilde{X})<\sigma^2$ , by Lemma 1, there exists a three-point distribution $F^*\in S_3(\mu, \sigma)$ such that ${\mathbb{E}}^{F^*}[X]={\mathbb{E}}^{\tilde{F}}[\tilde{X}]$ , $\mathrm{Var}^{F^*}(X)=\sigma^2$ , $e_\alpha^{F^*}(X \wedge d)=e_\alpha^{\tilde{F}}(\tilde{X} \wedge d)$ and ${\mathbb{E}}^{F^*}[(X-d)_+]\geqslant {\mathbb{E}}^{\tilde{F}}[(\tilde{X}-d)_+]$ . Then we have $f^{F^*}(d, X)\geqslant f^{\tilde{F}}(d, \tilde{X})=f^F(d, X)$ .
Hence, for any $F \in S(\mu, \sigma)$ , we can find a three-point distribution $F^* \in S_3(\mu, \sigma)$ such that $f^{F^*}(d, X)\geqslant f^F(d, X)$ . The proof is complete.
Remark 3. It is worth noting that if we extend the stop-loss contract to $I(x)=c(x-d)_+$ , where $c \in [0,1]$ , then with the additional parameter c the difficulty of solving the problem increases significantly. With $c=1$ , we have $X-I(X)=X\wedge d$ , but for $c\in[0,1)$ , $X-I(X)=X-c(X-d)_+= X\wedge d + (1-c) (X-d)_+$ , and the objective function is
\begin{align*}e_\alpha^F\left(X\wedge d + (1-c)(X-d)_+\right)+(1+\theta)\,c\,{\mathbb{E}}^F[(X-d)_+],\end{align*}
as opposed to
\begin{align*}e_\alpha^F\left(X\wedge d\right)+(1+\theta)\,{\mathbb{E}}^F[(X-d)_+]\end{align*}
in the case of a stop-loss contract. In general, we could not establish the same result as in Theorem 1. However, for $I(x)=c(x-d)_+$ we can show that the worst-case distribution can be confined to the set of four-point distributions $S_4(\mu,\sigma)$ by a method similar to that used for Theorem 1 and Lemma 1.
Proof of Proposition 4. (i) Denote by $f^*$ the optimal value of the problem (3.11). There exist feasible distributions $F_n=[x_1^n,p_1^n;\; x_2^n,p_2^n;\;x_3^n,p_3^n] \in S_3 (\mu, \sigma)$ , $n\in\mathbb{N}$ , such that $f^{F_n}(d,X) \rightarrow f^*$ , where $f^{F}(d,X)$ is defined by (3.6). The constraints in (3.10) imply that $p_i^n\in [0,1]$ , $i=1,2,3$ , and $x_1^n, x_2^n \in [0,d]$ for all $n\in\mathbb{N}$ . We consider the following two cases.

(a) Suppose that $\{x_3^n,\,n\in\mathbb{N}\}$ is bounded. By the Bolzano–Weierstrass theorem, there exists a subsequence of $(x_1^n,x_2^n,x_3^n,p_1^n,p_2^n,p_3^n)$ converging to some $(x_1^*,x_2^*,x_3^*,p_1^*,p_2^*,p_3^*)$ . By the continuity of the objective function and the constraints, one can verify that $(x_1^*,x_2^*,x_3^*,p_1^*,p_2^*,p_3^*)$ is also a feasible solution and $f^{F^*}(d,X)=f^*$ , where $F^*=[x_1^*,p_1^*;\;x_2^*,p_2^*;\;x_3^*,p_3^*]$ ; thus, $(x_1^*,x_2^*,x_3^*,p_1^*,p_2^*,p_3^*)$ is an optimal solution.

(b) Suppose that $\{x_3^n,\,n\in\mathbb{N}\}$ is unbounded. Then there exists a subsequence $(x_1^{n_j},x_2^{n_j},x_3^{n_j},p_1^{n_j},p_2^{n_j},p_3^{n_j})$ converging to some $(x_1^*,x_2^*,\infty,p_1^*,p_2^*,p_3^*)$ . As $p_3^{n_j} \leqslant (\mu^2+\sigma^2)/(x_3^{n_j})^2$ , letting $n_j \rightarrow \infty$ yields $p_3^*=0$ . Denote $F^*=[x_1^*,p_1^*;\;x_2^*,p_2^*]$ . Note that $\lim_{n_j \to \infty}x_3^{n_j}p_3^{n_j}=0$ , because $x_3^{n_j}p_3^{n_j} \leqslant (\mu^2+\sigma^2)/x_3^{n_j}$ . One can verify that $f^{F^*}(d,X)=\lim_{n_j\to\infty}f^{F_{n_j}}(d,X)=f^*$ . It also implies
\[\sum_{i=1}^2 p_i^*x_i^*= \lim_{n_j\to\infty} \sum_{i=1}^2 p_i^{n_j} x_i^{n_j} = \lim_{n_j\to\infty} \sum_{i=1}^3 p_i^{n_j} x_i^{n_j} =\mu\]and\[\sum_{i=1}^2 p_i^*(x_i^*)^2= \lim_{n_j\to\infty} \sum_{i=1}^2 p_i^{n_j} (x_i^{n_j})^2 \leqslant \lim_{n_j\to\infty} \sum_{i=1}^3 p_i^{n_j} (x_i^{n_j})^2 =\mu^2+\sigma^2.\]Therefore, $(x_1^*,x_2^*,d,p_1^*,p_2^*,0)$ is a feasible solution of the problem (3.11) and the optimal value is attained.
Combining the above two cases, we complete the proof of (i).

(ii) We consider the same two cases (a) and (b) as in the proof of (i). For case (a), it is obvious that $(x_1^*,x_2^*,x_3^*,p_1^*,p_2^*,p_3^*)$ is an optimal solution. For case (b), we have $ f^{F^*}(d,X)=f^*$ with $F^*=[x_1^*,p_1^*;\;x_2^*,p_2^*]$ , where $(x_1^*,x_2^*,d,p_1^*,p_2^*,0)$ is a feasible solution of the problem (3.11). This implies $\sum_{i=1}^2 p_i^*(x_i^*)^2\leqslant \mu^2+\sigma^2$ . We will show by contradiction that $\sum_{i=1}^2 p_i^*(x_i^*)^2=\mu^2+\sigma^2$ , which implies that $(x_1^*,x_2^*,d,p_1^*,p_2^*,0)$ is an optimal solution. Suppose that $\sum_{i=1}^2 p_i^*(x_i^*)^2<\mu^2+\sigma^2$ ; we consider the following five cases.

(b.i) If $ 0 \leqslant x_1^*< x_2^*<d$ , define $G=[x_1^*, p_1^*+\delta;\; x_2^* + \varepsilon, p_2^*-\delta]$ , where $\varepsilon= (x_2^*-x_1^*)\delta/(p_2^*-\delta)$ and $\delta\in (0,p_2^*)$ is small enough such that $\varepsilon \in (0,d-x_2^*)$ and ${\mathbb{E}}^G[X^2]\leqslant \mu^2+\sigma^2$ . In this case, one can verify that $f^{G}(d,X) >f^*$ , which contradicts the optimality of $f^*$ since $(x_1^*, x_2^* + \varepsilon, d, p_1^*+\delta, p_2^*-\delta,0)$ is a feasible solution of the problem (3.11).

(b.ii) If $0 \leqslant x_1^*=x_2^*<d$ , then $F^*$ is a degenerate distribution at $\mu$ and $0 < \mu <d$ . Define $G=[\mu-\varepsilon,\frac{1}{2};\;\mu+\varepsilon,\frac{1}{2}]$ , where $\varepsilon \in (0,\mu]$ is small enough such that $\mu+\varepsilon \leqslant d$ and ${\mathbb{E}}^G[X^2]\leqslant \mu^2+\sigma^2$ . In this case, one can verify that $f^{G}(d,X) >f^*$ , which contradicts the optimality of $f^*$ since $(\mu-\varepsilon, \mu+\varepsilon, d, \frac{1}{2},\frac{1}{2},0)$ is a feasible solution of the problem (3.11).

(b.iii) If $0<x_1^* < x_2^*=d$ , define $G=[x_1^*-\varepsilon, p_1^*-\delta;\; x_2^*, p_2^*+\delta]$ , where $\varepsilon= (x_2^*-x_1^*)\delta/(p_1^*-\delta)$ and $\delta \in (0,p_1^*)$ is small enough such that $\varepsilon\in(0,x_1^*]$ and ${\mathbb{E}}^G[X^2]\leqslant \mu^2+\sigma^2$ . In this case, one can verify that $f^{G}(d,X) >f^*$ , which yields a contradiction.

(b.iv) If $0<x_1^* = x_2^*=d$ , then $F^*$ is a degenerate distribution at $d$ and $0 < \mu =d$ . Define $G=[\mu-\varepsilon_1, q_1;\; \mu+\varepsilon_2, q_2]$ satisfying $(\mu-\varepsilon_1)q_1+(\mu+\varepsilon_2)q_2=\mu=d$ , $q_i>0$ , $i=1,2$ , and $\varepsilon_1 \in (0,\mu]$ ; then $\varepsilon_2=\frac{q_1}{1-q_1}\varepsilon_1$ . There exist an $\varepsilon_1$ and a $q_1$ small enough such that ${\mathbb{E}}^G[X^2]\leqslant \mu^2+\sigma^2$ and $f^{G}(d,X) >f^*$ , a contradiction.

(b.v) If $0=x_1^*<x_2^*=d$ , then $p_1^*=1-\mu/d$ and $p_2^*=\mu/d$ . One can calculate that
\[f^*=e_\alpha^{F^*}(X) = \frac{ \alpha\mu}{ (1-\alpha)(1-\mu/d)+\alpha\mu /d}.\]
In this case, if $(1+\theta)[(1-\alpha)(1-\mu/d)+\alpha\mu /d]^2 >\alpha(1-\alpha)$ , define $G=[0,p_1^*+\delta;\; d,p_2^*-2\delta;\;2d, \delta]$ , where $\delta \in [0,p_2^*/2]$ is small enough such that
\[\delta<\frac{\alpha(1-\alpha)-(1+\theta)[ (1-\alpha)(1-\mu/d)+\alpha\mu /d]^2 }{(1+\theta)(1-2\alpha)[(1-\alpha)(1-\mu/d)+\alpha\mu /d] }\]
and ${\mathbb{E}}^G[X^2]\leqslant \mu^2+\sigma^2$ . We have
\begin{align*} f^G(d,X) & = e_\alpha^G(X\wedge d) + (1+\theta) {\mathbb{E}}^G[(X-d)_+]\\[5pt] & =\frac{\alpha\mu-\alpha d \delta}{(1-\alpha)(1-\mu/d+\delta)+\alpha(\mu /d-\delta) } + (1+\theta) d\delta\\[5pt] & > \frac{ \alpha\mu}{ (1-\alpha)(1-\mu/d)+\alpha\mu /d}=f^*. \end{align*}
This yields a contradiction.
Combining the above five cases, we complete the proof.
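As a sanity check on case (b.v), the closed-form expressions above can be evaluated numerically. The sketch below uses illustrative parameter values of our own choosing (not values from the paper); they satisfy the case condition $(1+\theta)[(1-\alpha)(1-\mu/d)+\alpha\mu/d]^2>\alpha(1-\alpha)$ , and feasibility ${\mathbb{E}}^G[X^2]\leqslant\mu^2+\sigma^2$ additionally requires $\sigma^2 \geqslant d^2(\mu/d+2\delta)-\mu^2$ (here $\sigma^2\geqslant 405$ ).

```python
# Numerical check of the strict inequality in case (b.v): perturbing
# F* = [0, 1 - mu/d; d, mu/d] to G = [0, p1* + delta; d, p2* - 2*delta; 2d, delta]
# strictly increases f(d, X). Parameters below are illustrative only.

alpha, theta, mu, d, delta = 0.9, 0.2, 15.0, 30.0, 0.1
p1, p2 = 1.0 - mu / d, mu / d

def expectile_two_point(q0, qd, d, alpha):
    """alpha-expectile of a two-point law {0 w.p. q0; d w.p. qd}, solving the
    first-order condition alpha*qd*(d - e) = (1 - alpha)*q0*e."""
    return alpha * qd * d / ((1 - alpha) * q0 + alpha * qd)

# Case condition of the proof holds for these parameters.
D = (1 - alpha) * p1 + alpha * p2
assert (1 + theta) * D**2 > alpha * (1 - alpha)

f_star = expectile_two_point(p1, p2, d, alpha)   # = alpha*mu / D
# Under G, X ^ d puts mass p1* + delta at 0 and p2* - delta at d
# (both d and 2d are capped at d), and E^G[(X - d)_+] = d * delta.
f_G = (expectile_two_point(p1 + delta, p2 - delta, d, alpha)
       + (1 + theta) * d * delta)
print(f_star, f_G)   # 27.0 and about 29.314: f_G > f_star, as claimed
```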
5. Numerical examples
This section provides numerical analyses of the problem (3.1). We study the impacts of the parameters $\theta$ , $\alpha$ , and $(\mu,\sigma)$ on the optimal reinsurance design. After that, we compare our robust results with those obtained in the classical reinsurance model when the loss distributions are assumed to be Gamma, Lognormal, and Pareto distributions, respectively. To have more insights into model uncertainty, we further compare our results with the robust reinsurance design with VaR and CVaR in Liu and Mao (Reference Liu and Mao2022).
5.1. Impacts of parameters
Tables 1–3 give the optimal deductibles and the optimal values of the distributionally robust reinsurance problem (2.4) for three pairs of $(\mu,\sigma)$ , where $\mu=15$ and $\sigma=5, 10, 20$ , respectively.
Recall that the expected value premium principle is defined as $\pi\left(I(X)\right)=(1+\theta){\mathbb{E}}[I(X)]$ . This implies that for the same reinsurance coverage, the larger $\theta$ is, the more expensive the reinsurance will be. In other words, a larger $\theta$ motivates insurers to retain more risk themselves instead of entering a reinsurance contract. Hence, the optimal deductible $d^*$ should increase with $\theta$ . In addition, the confidence level $\alpha$ of an $\alpha$ -expectile represents the risk tolerance of the insurer. The higher $\alpha$ is, the more risk-sensitive the insurer is. Thus, the insurer would like to transfer more risk to the reinsurer by choosing a smaller deductible. The observations in Tables 1 and 2 align with these intuitions. The same logic applies to Table 3, but it is interesting to notice that when insurers face significant uncertainty (large $\sigma$ ), they prefer to transfer all risk to the reinsurer regardless of the price (see columns $\alpha = 0.9$ and $\alpha = 0.8$ of Table 3). Moreover, when the optimal contract is a zero-deductible plan ( $d^*=0$ ), the corresponding objective function value reduces to $(1+\theta)\mu$ , which depends only on the safety loading factor $\theta$ and the expected loss $\mu$ (e.g., the two rows with $\theta=0.1$ and $\theta=0.2$ in Table 3).
Intuitively, when the reinsurance is expensive (large $\theta$ ) and the insurer is not risk-sensitive (small $\alpha$ ), the insurer would prefer not to purchase any reinsurance. We can verify this result by looking at the upper right corner of all three tables, where $d^* = \infty$ . Numerically, we set $d^*$ to be $\infty$ when the plot of the objective function values exhibits a decreasing yet converging trend as $d$ increases. We verified our results by examining each such scenario with $d$ up to 1000, which should be sufficient since the probability of a positive payoff for $d = 1000$ would be less than .
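To illustrate how the objective is evaluated in the numerical experiments, the following minimal sketch (our own illustrative code, not the solver used for the tables; the three-point loss below is hypothetical) computes the discrete $\alpha$ -expectile by bisection on its first-order condition and verifies the zero-deductible reduction to $(1+\theta)\mu$ noted above.

```python
# Discrete alpha-expectile by bisection on the first-order condition
# alpha*E[(X - e)_+] = (1 - alpha)*E[(e - X)_+], and the objective
# f(d, X) = e_alpha(X ^ d) + (1 + theta)*E[(X - d)_+].

def expectile(xs, ps, alpha, tol=1e-12):
    lo, hi = min(xs), max(xs)
    while hi - lo > tol:
        e = 0.5 * (lo + hi)
        gain = sum(p * max(x - e, 0.0) for x, p in zip(xs, ps))
        loss = sum(p * max(e - x, 0.0) for x, p in zip(xs, ps))
        if alpha * gain > (1 - alpha) * loss:
            lo = e
        else:
            hi = e
    return 0.5 * (lo + hi)

def objective(xs, ps, d, alpha, theta):
    capped = [min(x, d) for x in xs]                       # X ^ d
    stop_loss = sum(p * max(x - d, 0.0) for x, p in zip(xs, ps))
    return expectile(capped, ps, alpha) + (1 + theta) * stop_loss

# A hypothetical three-point loss with mean mu = 15.
xs, ps = [0.0, 10.0, 40.0], [0.25, 0.5, 0.25]
alpha, theta = 0.9, 0.2
# At d = 0 the value is (1 + theta)*mu for ANY mean-mu loss: the retained
# part X ^ 0 vanishes and the premium equals (1 + theta)*E[X].
print(objective(xs, ps, 0.0, alpha, theta))   # 18.0 = (1 + 0.2) * 15
```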
5.2. Comparison with classical reinsurance model
Here, we compare the optimal deductibles and the optimal objective function values obtained in our distributionally robust model with those obtained in the classical reinsurance model. We assume the loss random variable in the classical reinsurance model follows the commonly used distributions in insurance: Gamma, Lognormal, and Pareto distributions.

(i) (Lognormal distribution) Suppose that X follows a lognormal distribution with $\ln(X)\sim \mathrm{N}(a,b^2)$ , where we write the normal parameters as $(a,b)$ to avoid confusion with the loss moments $(\mu,\sigma)$ . Then ${\mathbb{E}}[X]=e^{a+b^2/2}$ and ${\rm Var}(X) = e^{2a+b^2}(e^{b^2}-1)$ .

(ii) (Pareto distribution) Suppose that X follows a Pareto distribution with cumulative distribution function $F(x)=1-\left(\tau/x\right)^\beta$ for $x\geqslant \tau$ , where $\beta>2$ so that the variance is finite. Then ${\mathbb{E}}[X] =\dfrac{\beta \tau}{\beta-1} $ and ${\rm Var}(X)=\dfrac{\beta \tau^{2}}{(\beta-1)^{2}(\beta-2)}$ .

(iii) (Gamma distribution) Suppose that X follows a gamma distribution with density function
\[f(x)=\frac{\tau^{\gamma}x^{\gamma-1} e^{-\tau x}}{\Gamma(\gamma)}, \quad x > 0,\]
where $\gamma, \tau>0$ , and $\Gamma$ is the Gamma function defined by
\[\Gamma(a)=\int_{0}^{\infty} t^{a-1} e^{-t} \, \mathrm{d} t.\]
Then ${\mathbb{E}}[X]=\gamma/\tau$ and $\mathrm{Var}(X)=\gamma/\tau^2$ .
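For reproducibility, the moment expressions in items (i)–(iii) can be inverted to obtain each model's parameters from a target pair $(\mu,\sigma)$ . The following sketch performs this moment matching (the helper names are ours, not from the paper):

```python
# Moment matching: recover distribution parameters so that E[X] = mu and
# Var(X) = sigma^2, by inverting the moment formulas of items (i)-(iii).
import math

def lognormal_params(mu, sigma):
    # ln X ~ N(a, b^2): mu = e^{a + b^2/2}, sigma^2 = e^{2a + b^2}(e^{b^2} - 1)
    b2 = math.log(1.0 + (sigma / mu) ** 2)
    return math.log(mu) - b2 / 2.0, math.sqrt(b2)

def pareto_params(mu, sigma):
    # sigma^2/mu^2 = 1/(beta(beta - 2))  =>  beta = 1 + sqrt(1 + mu^2/sigma^2)
    beta = 1.0 + math.sqrt(1.0 + (mu / sigma) ** 2)
    return beta, mu * (beta - 1.0) / beta

def gamma_params(mu, sigma):
    # gamma/tau = mu and gamma/tau^2 = sigma^2
    tau = mu / sigma ** 2
    return mu * tau, tau

for mu, sigma in [(15.0, 5.0), (15.0, 10.0), (15.0, 20.0)]:
    print(lognormal_params(mu, sigma), pareto_params(mu, sigma),
          gamma_params(mu, sigma))
```

Each inversion can be checked by substituting the recovered parameters back into the moment formulas.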
Recall that the classical reinsurance model corresponding to our distributionally robust model (2.5) is as follows:
\[\min_{d \geqslant 0}\; e_\alpha(X\wedge d) + (1+\theta){\mathbb{E}}[(X-d)_+],\]
where X follows a precisely known distribution. In order to make comparisons with our robust results, for each pair of $(\mu,\sigma)$ studied in the previous section, the parameters of the aforementioned models are set such that ${\mathbb{E}}[X]=\mu$ and $\mathrm{Var}(X)=\sigma^2$ . Table 4 gives the results for different values of $\sigma$ , and in order to better illustrate the effect of model uncertainty, we also include the premium/risk ratio under each model. The premium is the price of the reinsurance contract, and the risk retained by the insurer is simply the optimal value of our objective function. It is not surprising to observe that in both the robust and non-robust cases, the premium/risk ratio increases with the riskiness of the loss variable, measured by $\sigma$ . For $\sigma=3,5$ , the optimal deductible is strictly positive, and the premium/risk ratio in the robust case is moderately larger than that in the non-robust case. However, if $\sigma$ is large, the optimal deductible is zero and the premium/risk ratio is much larger than that in the non-robust case. The large differences in the premium/risk ratios between the robust and non-robust models illustrate the catastrophic consequences if the insurer fails to select the correct model, so insurers need to be alert to model uncertainty. However, due to the limited data in the tail of the loss distribution, determining the true model is rather difficult, if not impossible, and our results under the robust design suggest the insurer should act more cautiously.
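As an illustration of the premium computation, under the moment-matched Pareto model in item (ii) the stop-loss transform has a closed form for $d\geqslant\tau$ . The sketch below uses illustrative choices $d=20$ and $\theta=0.2$ (not values from Table 4) to evaluate the premium $(1+\theta){\mathbb{E}}[(X-d)_+]$ and cross-checks the closed form by numerical integration.

```python
# Pareto stop-loss premium: for d >= tau, integrating the survival function
# S(x) = (tau/x)^beta over (d, infinity) gives the closed form
# E[(X - d)_+] = tau^beta * d^(1 - beta) / (beta - 1).
# (mu, sigma) = (15, 5), d = 20, theta = 0.2 are illustrative choices.
import math

mu, sigma, theta, d = 15.0, 5.0, 0.2, 20.0
beta = 1.0 + math.sqrt(1.0 + (mu / sigma) ** 2)   # moment-matched shape
tau = mu * (beta - 1.0) / beta                     # moment-matched scale

stop_loss = tau ** beta * d ** (1.0 - beta) / (beta - 1.0)
premium = (1.0 + theta) * stop_loss

# Cross-check by trapezoidal integration of S on [d, 2000]; the truncated
# tail beyond 2000 contributes less than 1e-6 here.
def S(x):
    return (tau / x) ** beta

n, grid_end = 198_000, 2000.0
h = (grid_end - d) / n
numeric = h * (0.5 * S(d) + sum(S(d + i * h) for i in range(1, n))
               + 0.5 * S(grid_end))
print(stop_loss, numeric, premium)
```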
Figure 2 plots the values of the objective function against the deductible d. The robust-case curve corresponds to the objective function in our distributionally robust reinsurance model, and the other three curves correspond to the classical reinsurance model when the distribution function is precisely known to be a Gamma, Lognormal, or Pareto distribution. The objective value in the robust case is consistently larger by construction, and we confirm that the risk may be underestimated if distributional uncertainty is ignored. From the graphs in the first row of Figure 2, it is interesting to observe that the optimal reinsurance contract in the robust case is not necessarily the most conservative one in terms of the amount of loss retained by the insurer. However, when $\sigma$ becomes large, the optimal contract in the robust case eventually becomes the zero-deductible plan, whereas the optimal contracts in the other three cases may still have moderate deductibles. This implies that a significant portion of the risk may be unintentionally held by the insurer if the optimal design is determined by a misspecified loss distribution. For example, as illustrated in the bottom right graph, if the Pareto distribution is mistakenly used, the value of the expectile function can be underestimated by up to 25%. Model uncertainty plays a crucial role when the underlying risk is significant, and distributionally robust optimization provides a conservative benchmark.
5.3. Comparison with distributionally robust reinsurance model with VaR/CVaR
In this section, we compare the optimal deductibles and optimal values obtained in the expectile-based distributionally robust model with those of the model in Liu and Mao (Reference Liu and Mao2022) under VaR/CVaR. Tables 5 and 6 give the results for four different values of $\sigma$ with $\mu=15$ and $\theta=2$ . Following the conclusion in Bellini and Bernardino (Reference Bellini and Bernardino2017) that “for the most common distributions, expectiles are closer to the center of the distribution than the corresponding quantiles,” we choose a series of larger $\alpha$ ’s for $e_\alpha$ than for $\mathrm{VaR}_\alpha$ to make the comparison. Table 5 compares the optimal deductibles and optimal values when $\alpha=0.99$ for VaR and $\alpha=0.99, 0.991, 0.993, 0.995$ for the expectile. The results suggest that for the same level $\alpha$ , the optimal deductible based on VaR is always smaller than that based on the expectile, which means that VaR users are more conservative: they prefer to transfer more risk to a reinsurer than expectile users at the same level. Table 6 compares the optimal deductibles and optimal values when $\alpha=0.8$ for VaR and $\alpha=0.8, 0.81, 0.85, 0.9$ for the expectile. In this case, we have similar observations, but the optimal deductibles for VaR and expectile users differ more significantly, as the gap between the two risk measures, expectile and CVaR, widens for smaller levels $\alpha$ . Both tables suggest that when $\sigma$ is large, the optimal deductibles are zero and the optimal values are identical. Intuitively, a large $\sigma$ means more uncertainty, and the insurer would rather transfer all risk to the reinsurer given such substantial uncertainty. In addition, we note that the optimal deductibles and the optimal values of CVaR at levels $\alpha=0.8$ and $0.99$ are identical since the result based on CVaR satisfies the hybrid property.
This is because a VaR/CVaR user in distributionally robust reinsurance is indifferent to any level $\alpha$ satisfying $\sigma^2/\mu^2<\theta\leqslant \alpha/(1-\alpha)$ . In sharp contrast, the optimal deductible based on the expectile is continuous in the parameter $\alpha$ . From this perspective, the problem based on the expectile is more reasonable.
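The indifference condition can be checked directly. Assuming for illustration the values $\mu=15$ , $\theta=2$ , and $\sigma\in\{3,5,10,20\}$ (the first two match Section 5.2; the larger two are our assumption about the tables), both levels $\alpha=0.8$ and $\alpha=0.99$ satisfy $\sigma^2/\mu^2<\theta\leqslant\alpha/(1-\alpha)$ , which is consistent with the CVaR-based results coinciding:

```python
# Check the indifference condition sigma^2/mu^2 < theta <= alpha/(1 - alpha)
# for the two CVaR levels compared in Tables 5 and 6. The sigma values are
# illustrative assumptions, with mu = 15 and theta = 2 as stated above.
mu, theta = 15.0, 2.0

def indifferent(alpha, sigma, mu=mu, theta=theta):
    return sigma ** 2 / mu ** 2 < theta <= alpha / (1.0 - alpha)

for sigma in (3.0, 5.0, 10.0, 20.0):
    print(sigma, indifferent(0.8, sigma), indifferent(0.99, sigma))
# Both levels satisfy the condition for every sigma listed, so the CVaR-based
# optimal deductibles at alpha = 0.8 and alpha = 0.99 coincide.
```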
6. Concluding remarks
In this paper, we investigate a distributionally robust reinsurance problem with the expectile under model uncertainty, where the distribution of the loss is only partially known in the sense that only its mean and variance are given. By showing that the worst-case distribution must belong to the set of three-point distributions, we reduce the infinite-dimensional minimax problem to a tractable finite-dimensional optimization problem. By comparing the results with those of the classical reinsurance problem, we demonstrate the importance of incorporating model uncertainty and show that the consequences of model misspecification may be severe when the underlying risk is significant. Finally, we point out that characterizing an explicit solution for the worst-case distribution is very challenging, and we leave it for future work.
Acknowledgments
The authors thank the Editor and three anonymous referees for their insightful comments, which helped improve the paper.