1. Introduction
In statistical inference problems where no fixed-sample-size procedure exists, sequential sampling schemes have been developed and widely used, with efficiency properties established in terms of the required sample size. The fundamental theories of sequential estimation are largely based on the ground-breaking papers [Reference Anscombe2, Reference Chow and Robbins3], in which purely sequential sampling methodologies were developed for the problem of constructing fixed-width confidence intervals (FWCIs).
In a parallel path, [Reference Robbins and Grenander27] originally formulated the minimum risk point estimation (MRPE) problem. Under the absolute error loss plus linear cost of sampling, a purely sequential stopping rule was proposed to estimate an unknown normal mean $\mu$ when the variance $\sigma^{2}$ was assumed unknown. Later, [Reference Starr31, Reference Starr and Woodroofe32] considered a more general loss function and proved a number of interesting asymptotic first- and second-order properties of the purely sequential MRPE methodology. Using the nonlinear renewal-theoretic tools developed in [Reference Lai and Siegmund13, Reference Lai and Siegmund14], [Reference Woodroofe34] further derived explicit second-order approximations associated with efficiency, risk efficiency, and regret. Moreover, [Reference Ghosh and Mukhopadhyay5] provided a different method to evaluate the expression for regret, which generalizes the corresponding result in [Reference Woodroofe34].
Let us begin with a sequence of independent and identically distributed (i.i.d.) positive and continuous random variables $\{W_{n},n\ge 1\}$ . For simplicity, we assume that all positive moments of these random variables are finite, with mean $\mathbb{E}[W_{1}]=\theta$ and variance $\mathbb{V}[W_{1}]=\tau^{2}$ . In the light of [Reference Woodroofe34, (2.5)], we further assume that the distribution function of $W_{1}$ satisfies $\mathbb{P}(W_{1}\le x)\le Bx^{\alpha}$ for all $x>0$ and for some $B>0$ and $\alpha>0$ , both free from x. This condition may seem irrelevant or unused, but it is necessary for the results in Theorem 1; for more details, see [Reference Woodroofe34]. Similar to [Reference Woodroofe34, (1.1)], all the stopping times arising from the aforementioned inference problems can be written in a general form given by
where $\delta>0$ is a constant, $\{l_{1}(n)\}$ is a convergent sequence of numbers with $l_{1}(n)=1+l_{0}n^{-1}+o(n^{-1})$ as $n\rightarrow \infty$ and $-\infty <l_{0}<\infty$ , $m \ge 1$ indicates a pilot sample size, and $n^{\ast}$ is called an optimal fixed sample size whose expression is to be determined in specific problems. See [Reference Mukhopadhyay and de Silva18, Section A.4] for more details.
The stopping rule (1) is implemented as follows. After an initial sample of size m is gathered, one observation is taken at a time as needed. Each time a new observation is recorded, the sample data is checked against the stopping rule, and sampling terminates the first time the stopping rule is satisfied. Therefore, this is a purely sequential sampling scheme; we denote it by $\mathcal{M}_{0}$ .
To introduce the properties of the stopping time $t_0$ and relate its expected value to the optimal fixed sample size, let us define a general function $\eta(k)$ as follows. For each integer $k\ge 1$ ,
where $\{u\}^{+}=\max\{0,u\}$ . Under certain conditions, [Reference Woodroofe34] fully studied the properties of $t_{0}$ , which are summarized in the following theorem.
Theorem 1. For the purely sequential sampling scheme $\mathcal{M}_{0}$ and the stopping time $t_{0}$ given in (1), if $m>(\alpha\delta)^{-1}$ , then $\mathbb{E}_{\theta,\tau}[t_{0}-n^{\ast}] =\eta (1)+o(1)$ as $n^{\ast}\rightarrow \infty$ .
In the spirit of [Reference Hall7], we define the term ‘sampling operation’ as the procedure of collecting new observations and evaluating the sample data to make a decision. Let $\varphi$ denote the number of sampling operations. For the purely sequential sampling scheme $\mathcal{M}_{0}$ associated with the stopping time $t_{0}$ given in (1), we have $\varphi_{\mathcal{M}_{0}}=t_{0}-m+1$ and
Not surprisingly, the purely sequential sampling scheme $\mathcal{M}_{0}$ requires a large number of sampling operations. In view of this, [Reference Hall7] proposed an accelerated sequential estimation methodology that saves sampling operations. Furthermore, [Reference Mukhopadhyay and Solanky22] and [Reference Mukhopadhyay16] developed alternative formulations of the accelerated sequential sampling technique. The purely sequential sampling scheme of Anscombe–Chow–Robbins was improved in [Reference Liu15], which proposed a new sequential methodology requiring substantially fewer sampling operations. Accelerated sequential sampling methodologies first draw samples sequentially part of the way and then augment with sampling in one single batch. As per the discussion in [Reference Mukhopadhyay and de Silva18, p. 228]: ‘An accelerated sequential strategy would always be operationally much more convenient in practical implementation than its sequential counterpart!’
On the other hand, [Reference Hayre8] considered sampling in bulk or groups, rather than one at a time, and proposed group sequential sampling with variable group sizes. A concept of sequential planning was presented in [Reference Schmegner and Baron29] as an extension and generalization of group sequential procedures. These sampling schemes have been shown to require only a few groups to cross the stopping boundary, leading to only a moderate increase in sample size.
Most recently, [Reference Mukhopadhyay and Wang24] introduced a new type of sequential sampling scheme that records k observations at a time, motivated by the observation that, in real life, packaged items purchased in bulk often cost less per unit than individual items. This new sampling strategy was discussed in both FWCI and MRPE problems for the mean of a normal population. These problems were revisited in [Reference Mukhopadhyay and Wang25], which incorporated newly constructed estimators under permutations within each group for the stopping boundaries, leading to tighter estimation of required sample sizes. A double-sequential sampling scheme was developed in [Reference Hu9], which samples k at a time part of the way and then one at a time sequentially; it requires sample sizes similar to the purely sequential strategies but saves sampling operations. Sequential estimation strategies for big data science with minimal computational complexities were further proposed in [Reference Mukhopadhyay and Sengupta21], based on the idea of sampling k-tuples instead of one single observation.
In this paper, we incorporate these path-breaking ideas with modifications to accelerate the purely sequential sampling scheme without sacrificing first- and second-order efficiency: (i) using the purely sequential methodology to determine only a proportion $\rho$ $(0<\rho<1)$ of the desired final sample, and then augmenting with sampling in one single batch; and (ii) drawing a fixed number k $(k\ge 2)$ of observations at a time successively until termination in the sequential sampling portion. In this way, we expect to save roughly $100(1-k^{-1}\rho)\%$ of sampling operations for any given combination of k and $\rho$ values compared with a purely sequential strategy where $k=1$ and $\rho=1$ . Taking all possible combinations of $k\ge1$ and $0<\rho\le1$ values into consideration, we propose a novel and unified accelerated group sequential sampling scheme, denoted by $\mathcal{M}(\rho,k)$ , in Section 2 along with a number of desirable properties. The sampling scheme $\mathcal{M}(\rho,k)$ encompasses a wide range of sampling procedures with different selections of k and $\rho$ values:
(i) $k=1$ , $\rho=1$ : the purely sequential sampling scheme, originally established in [Reference Anscombe2, Reference Chow and Robbins3];
(ii) $k=1$ , $0<\rho<1$ : the accelerated sequential sampling scheme, first established in [Reference Hall7], with a unified version proposed in [Reference Mukhopadhyay and Solanky22, Reference Mukhopadhyay16];
(iii) $k>1$ , $\rho=1$ : the k-at-a-time group sequential sampling scheme, first brought up in [Reference Mukhopadhyay and Wang24, Reference Mukhopadhyay and Wang25];
(iv) $k\ge2$ , $0<\rho<1$ : the k-at-a-time accelerated group sequential sampling scheme, proposed as a new sampling scheme in this paper.
Remark 1. This work provides a sampling scheme that is both unified and novel because it incorporates the traditional sampling strategies (purely sequential and accelerated sequential schemes), the relatively new k-at-a-time group sequential sampling scheme, and the new k-at-a-time accelerated sequential sampling scheme, all under one big umbrella. We also provide first- and second-order properties for this unified sampling scheme in general.
The rest of this paper is organized as follows. Section 2 proposes the accelerated group sequential sampling scheme $\mathcal{M}(\rho,k)$ and explores its appealing first- and second-order properties, with a special focus on the k-at-a-time accelerated group sequential sampling scheme. In Section 3, we construct MRPE for an unknown normal mean $\mu$ with the variance $\sigma^{2}$ also assumed unknown as a possible illustration of the newly proposed sequential sampling scheme $\mathcal{M}(\rho,k)$ . In Section 4, we construct bounded variance point estimation for an unknown location parameter $\mu$ from a negative exponential distribution. Simulated performances and real data analyses are included to support and supplement our theory for both methodologies in Sections 3 and 4. Section 5 shares some brief concluding thoughts.
2. The accelerated group sequential sampling scheme $\mathcal{M}(\rho,k)$
In this section, we provide a detailed general framework for accelerated group sequential sampling. We start with the stopping rules and then present the appealing properties of this sampling scheme.
Under the assumptions given in Section 1, we propose a novel and unified sequential sampling scheme $\mathcal{M}(\rho,k)$ associated with the following stopping times modified in view of (1):
In addition to the notation in the stopping rule (1), $k\ge 1$ is a prefixed integer, $0<\rho \le 1$ is a prefixed proportion, $\{l_{k}(n)\}$ is a convergent sequence of numbers with $l_{k}(n)=1+l_{0}(kn)^{-1}+o(n^{-1})$ as $n\rightarrow \infty $ and $-\infty<l_{0}<\infty $ , $U_{i}\equiv \sum_{j=(i-1)k+1}^{ik}W_{j}$ , $i=1,2,\ldots$ , are i.i.d. random variables, and $\lfloor u\rfloor$ denotes the largest integer that is strictly smaller than u.
The stopping rule (4) is implemented as follows. Starting with km pilot observations, we sample k observations at a time as needed successively and determine a preliminary sample size of $kt_{1}(\rho,k)$ . Then, we continue to sample $t_{2}(\rho,k)-kt_{1}(\rho,k)$ additional observations if needed, all in one batch. Obviously, $\mathbb{P}_{\theta,\tau}(t_{2}(\rho,k)<\infty)=1$ and $t_{2}(\rho,k)\uparrow \infty $ with probability 1 as $n^{\ast}\uparrow\infty$ . If both $\rho$ and k are chosen to be 1, then our newly developed sampling scheme $\mathcal{M}(1,1)$ reduces to the purely sequential sampling scheme $\mathcal{M}_{0}$ associated with the stopping rule (1). If $\rho=1$ and $k\ge 2$ , the new sampling scheme $\mathcal{M}(1,k)$ is group sequential, taking multiple (k) observations at a time. If $0<\rho <1$ and $k=1$ , the new sampling scheme $\mathcal{M}(\rho,1)$ is an accelerated sequential sampling scheme, similar to the one proposed in [Reference Mukhopadhyay16].
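To make the two-phase mechanics concrete, the following minimal R skeleton sketches $\mathcal{M}(\rho,k)$ . The function names `draw_obs` (the data source) and `boundary_crossed` (the problem-specific stopping criterion in (4)) are hypothetical placeholders, not objects from the paper, and the batch size $\lceil(\text{preliminary size})/\rho\rceil$ reflects our reading of the augmentation step; setting $k=1$ and $\rho=1$ recovers the purely sequential scheme $\mathcal{M}_{0}$ .

```r
# A minimal sketch of the two-phase scheme M(rho, k).
# 'draw_obs' and 'boundary_crossed' are hypothetical placeholders for the
# data source and the problem-specific stopping criterion in (4).
run_M <- function(draw_obs, boundary_crossed, m, k, rho) {
  w <- draw_obs(k * m)                     # pilot sample of size k*m
  # Phase 1: k-at-a-time group sequential sampling
  while (!boundary_crossed(w)) {
    w <- c(w, draw_obs(k))                 # one sampling operation: a group of k
  }
  # Phase 2: scale the preliminary size by 1/rho, augment in one single batch
  n_final <- max(length(w), ceiling(length(w) / rho))
  c(w, draw_obs(n_final - length(w)))
}
```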
Compared with the purely sequential sampling scheme $\mathcal{M}_{0}$ , our newly developed sequential sampling scheme $\mathcal{M}(\rho,k)$ reduces the operational time by approximately $100(1-k^{-1}\rho)\%$ , which makes it flexible in practice. One can choose the values of k and $\rho$ to balance the sampling process, time limitations, and cost considerations under different situations. We include a brief discussion here to illustrate the flexibility of our sampling scheme.
If the cost of taking more observations is high, we can choose a smaller k value and/or a $\rho$ closer to 1; if the process is more time sensitive, we can use a larger k and/or a $\rho$ closer to 0. As a direct result of the operational convenience, the new sequential sampling scheme $\mathcal{M}(0<\rho <1,k\ge 2)$ tends to oversample, but the increase in the expected sample size is bounded by an amount depending on $\rho$ and k. A thorough investigation of the efficiency properties of the sampling scheme $\mathcal{M}(\rho,k)$ follows.
We now establish an appealing property of this unified sampling scheme. We suppose that the following limit operation holds in the spirit of [Reference Hall6]:
where $r\ge1$ is a fixed constant. In the light of (1) and Theorem 1, we are now in a position to state the major results of this paper as Theorem 2. See [Reference Woodroofe34, Theorem 2.4] for more details.
Theorem 2. For the accelerated group sequential sampling scheme $\mathcal{M}(\rho,k)$ and the stopping times $t_{1}(\rho,k)$ and $t_{2}(\rho,k)$ given in (4), for fixed $0<\rho<1$ and k, with $\eta(k)$ defined in (2), under the limit operation (5):
If $\rho=1$ , then $\mathbb{E}_{\theta,\tau}[t_{2}(1,k)-n^{\ast}]=\mathbb{E}_{\theta,\tau}[kt_{1}(1,k)-n^{\ast}]=\eta(k)+o(1)$ .
It is clear that when $0<\rho<1$ and $k\ge 2$ , our new sampling scheme $\mathcal{M}(\rho,k)$ is expected to oversample up to $\rho^{-1}\eta(k)+1+o(1)$ observations. In terms of the number of sampling operations, it is not hard to see that $\varphi_{\mathcal{M}(\rho,k)}=t_{1}(\rho,k)-m+1+\mathbf{1}(\rho <1)$ , where $\mathbf{1}(D)$ stands for the indicator function of an event D. We also have
Comparing (3) and (7), the accelerated group sequential sampling scheme $\mathcal{M}(\rho,k)$ requires roughly $100(1-k^{-1}\rho)\%$ fewer sampling operations than the purely sequential sampling scheme $\mathcal{M}_{0}$ , depending on the actual choices of k and $\rho$ . Therefore, it enjoys great operational convenience at the cost of only a slight increase in the projected final sample size.
Remark 2. According to Theorem 2, as the optimal sample size $n^*$ tends to infinity, the extra sample size is expected to be a finite number around $\rho^{-1}\eta(k)$ , while the saving in the number of sampling operations is expected to be $(1-k^{-1}\rho)n^*+O(1)$ . Given that the cost per sampled unit and the cost per sampled group are both positive constants, this indicates that the extra cost due to oversampling tends to a finite number, while the saving due to acceleration tends to infinity. In this sense, accelerated group sequential sampling is still advantageous despite the possible overshoot. Moreover, as seen from the Monte Carlo simulation studies in Tables 2 and 6, the oversampling under discussion is within 5 observations for all the scenarios we simulated.
Hereafter, we mainly focus on the sampling scheme $\mathcal{M}(\rho,k)$ with $0<\rho<1$ and $k \ge 2$ , which makes it specifically the k-at-a-time accelerated group sequential sampling scheme. Nevertheless, note that all the theories and methodologies we discuss generally work for the sequential sampling scheme with $0<\rho\leq 1$ and/or $k\geq 1$ .
3. Minimum risk point estimation for a normal mean
In this section, we discuss minimum risk point estimation (MRPE) for a normal mean as an illustration of our accelerated group sequential sampling scheme $\mathcal{M}(\rho,k)$ . Having recorded a sequence of independent observations $X_{1},\ldots,X_{n}$ , $n\ge 2$ , from an $N(\mu,\sigma^{2})$ population where both $\mu\in\mathbb{R}$ and $\sigma\in\mathbb{R}^+$ are unknown, we denote the sample mean, the sample variance, and the sample standard deviation as follows:
According to [Reference Robbins and Grenander27], the MRPE for $\mu$ under the squared-error loss plus linear cost of sampling can be formulated as follows. Define the loss function by
where $A\,(>0)$ is a known weight and $c\,(>0)$ is the known unit cost of each observation. Associated with the loss function in (8), we have the following risk function:
which is minimized at
with the resulting minimum risk
Note that no fixed-sample-size procedure achieves the exact minimum risk, because $\sigma$ is unknown. A fundamental solution is the purely sequential MRPE methodology in the light of [Reference Robbins and Grenander27, Reference Starr31, Reference Starr and Woodroofe32], briefly introduced below.
Since the population standard deviation $\sigma$ remains unknown, we customarily estimate it by the sample standard deviation $S_{n}$ , updating its value at every stage. We start with m pilot observations, $X_{1},\ldots,X_{m}$ , $m\ge 2$ , and then sample one additional observation at a time as needed until the following stopping rule is satisfied:
It is clear that $\mathbb{P}_{\mu,\sigma}\{N_{\mathcal{P}_{0}}<\infty \}=1$ and $N_{\mathcal{P}_{0}}\uparrow \infty $ with probability 1 as $c\downarrow 0$ . Upon termination with the accrued data $\{N_{\mathcal{P}_{0}},X_{1},\ldots,X_{m},\ldots,X_{N_{\mathcal{P}_{0}}}\}$ , we estimate the unknown normal mean $\mu$ with $\bar{X}_{N_{\mathcal{P}_{0}}}\equiv N_{\mathcal{P}_{0}}^{-1}\sum_{i=1}^{N_{\mathcal{P}_{0}}}X_{i}$ . The achieved risk is then given by
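As a concrete illustration, the short R sketch below simulates one run of the purely sequential rule (11). The stopping boundary $n\ge\sqrt{A/c}\,S_{n}$ is read off from $n^{\ast}=\sigma\sqrt{A/c}$ (see (9)); the parameter values are illustrative and match the simulation setup of Section 3.2, where $n^{\ast}=2\sqrt{100/0.04}=100$ .

```r
# One simulated run of the purely sequential MRPE rule (11): start with m
# pilot observations, then add one at a time until n >= sqrt(A/c) * S_n.
set.seed(1)
mu <- 5; sigma <- 2; A <- 100; c0 <- 0.04; m <- 21   # c0 is the unit cost c
x <- rnorm(m, mu, sigma)
while (length(x) < sqrt(A / c0) * sd(x)) {
  x <- c(x, rnorm(1, mu, sigma))
}
c(N = length(x), estimate = mean(x))   # n* = sigma * sqrt(A/c) = 100 here
```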
To measure the closeness between the achieved risk in (12) and the minimum risk in (10), [Reference Robbins and Grenander27] and [Reference Starr31] respectively introduced the following two crucial notions, namely the risk efficiency and the regret:
Alternatively, the stopping rule (11) can be rewritten in the general form presented in (1). Using the Helmert transformation, we express $N_{\mathcal{P}_{0}}=N_{\mathcal{P}_{0}}^{\prime}+1$ with probability 1, where the new stopping time $N_{\mathcal{P}_{0}}^{\prime}$ is defined as follows:
with $\delta=2$ , $l_{0}=2$ , and $W_{1},W_{2},\ldots$ being i.i.d. $\chi_{1}^{2}$ random variables such that $\theta=1$ , $\tau^{2}=2$ , and $\alpha=\frac12$ . From [Reference Robbins and Grenander27], [Reference Starr31], and [Reference Woodroofe34], we summarize in the following theorem the asymptotic first- and second-order properties that the purely sequential MRPE methodology $\mathcal{P}_{0}$ enjoys. See [Reference Mukhopadhyay and de Silva18] for more details.
Theorem 3. For the purely sequential MRPE methodology $\mathcal{P}_{0}$ given in (11), for all fixed $\mu$ , $\sigma$ , m, and A, as $c\rightarrow 0$ :
(i) Asymptotic first-order efficiency: $\mathbb{E}_{\mu,\sigma}[N_{\mathcal{P}_{0}}/n^{\ast}]\rightarrow 1$ if $m\geq 2$ ;
(ii) Asymptotic second-order efficiency: $\mathbb{E}_{\mu,\sigma}[N_{\mathcal{P}_{0}}-n^{\ast}] = \eta_1(1)+o(1)$ if $m\geq 3$ , where $\eta_1(1)=-\frac{1}{2}\sum_{n=1}^{\infty}n^{-1}\mathbb{E}\big[\big\{\chi_{n}^{2}-3n\big\}^+\big]$ ;
(iii) Asymptotic first-order risk efficiency: $\xi_{\mathcal{P}_{0}}(c) \rightarrow 1$ if $m\geq 3$ ;
(iv) Asymptotic second-order risk efficiency: $\omega_{\mathcal{P}_{0}}(c)=\frac{1}{2}c+o(c)$ if $m\geq 4$ .
3.1. The accelerated group sequential MRPE methodology
Following (6), we propose an accelerated group sequential MRPE methodology $\mathcal{P}(\rho,k)$ :
Here, $0<\rho \le 1$ is a prefixed proportion, $k\ge 1$ is a prefixed positive integer, $m\ge 2$ again indicates a pilot sample size but is picked such that $m-1\equiv 0{\pmod k}$ , and $\lfloor u\rfloor$ continues to denote the largest integer that is strictly smaller than u. Writing $m-1=m_{0}k$ for some integer $m_{0}\ge 1$ , we further assume that the following limit operation holds:
where $r\ge1$ is a fixed constant. The new methodology $\mathcal{P}(\rho,k)$ is implemented as follows.
Starting with $m(=m_{0}k+1)$ pilot observations, $X_{1},\ldots,X_{m}$ , we sample k observations at a time as needed and determine $T_{\mathcal{P}(\rho,k)}$ , the number of sequential sampling operations according to the stopping rule (14). Next, we continue to sample $N_{\mathcal{P}(\rho,k)}-m-kT_{\mathcal{P}(\rho,k)}$ additional observations all in one batch. Upon termination, based on the fully gathered data
we construct the minimum risk point estimator $\bar{X}_{N_{\mathcal{P}(\rho,k)}}=N_{\mathcal{P}(\rho,k)}^{-1}\sum_{i=1}^{N_{\mathcal{P}(\rho,k)}}X_{i}$ for $\mu$ , and derive
Obviously, $\mathbb{P}_{\mu,\sigma}(N_{\mathcal{P}(\rho,k)}<\infty)=1$ and $N_{\mathcal{P}(\rho,k)}\uparrow \infty $ with probability 1 as $c\downarrow 0$ . If both $\rho$ and k are chosen to be 1, then the sequential MRPE methodology $\mathcal{P}(1,1)$ will be the purely sequential MRPE methodology $\mathcal{P}_{0}$ as per (11). That is, $\mathcal{P}(1,1)\equiv \mathcal{P}_{0}$ .
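The R sketch below simulates one run of $\mathcal{P}(\rho,k)$ . The sequential boundary $n\ge\rho\sqrt{A/c}\,S_{n}$ matches the termination inequality used in the proof of Theorem 4, while the batch size $\lceil(m+kT)/\rho\rceil-(m+kT)$ is our reading of the augmentation step in (14); treat it as a sketch rather than the authors' exact code.

```r
# One simulated run of the accelerated group sequential MRPE methodology
# P(rho, k): k-at-a-time sampling up to the rho-scaled boundary, then one
# single batch to reach the projected final sample size.
P_rho_k <- function(mu, sigma, A, c0, m0, k, rho) {
  m <- m0 * k + 1
  x <- rnorm(m, mu, sigma)                   # pilot sample, m = m0*k + 1
  T_ops <- 0
  while (length(x) < rho * sqrt(A / c0) * sd(x)) {
    x <- c(x, rnorm(k, mu, sigma))           # one group of k observations
    T_ops <- T_ops + 1
  }
  n_final <- max(length(x), ceiling(length(x) / rho))
  x <- c(x, rnorm(n_final - length(x), mu, sigma))  # batch augmentation
  list(N = length(x),
       ops = T_ops + 1 + (rho < 1),          # pilot + T groups + batch
       estimate = mean(x))
}
set.seed(2)
P_rho_k(mu = 5, sigma = 2, A = 100, c0 = 0.04, m0 = 4, k = 5, rho = 0.8)
```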
Along the lines of (13), we can similarly express the stopping time $T_{\mathcal{P}(\rho,k)}$ from (14) in the general form provided in (4). Define $T_{\mathcal{P}(\rho,k)}=T_{\mathcal{P}(\rho,k)}^{\prime}-m_{0}$ with probability 1. Then $T_{\mathcal{P}(\rho,k)}^{\prime}$ is a new stopping time that can be rewritten as
where $U_i=\sum_{j=(i-1)k+1}^{ik}W_j$ , $i=1,2,\ldots$ , with $W_1,W_2,\ldots$ being i.i.d. $\chi^2_{1}$ random variables. We list the simplified version of the equation for $T^{\prime}_{\mathcal{P}(\rho,k)}$ in (16) for easy reference:
where $\delta=2$ , $l_{0}=2$ , and $U_{i}=\sum_{j=(i-1)k+1}^{ik}W_{j}$ , $i=1,2,\ldots$ , with $W_{1},W_{2},\ldots$ being i.i.d. $\chi_{1}^{2}$ random variables such that $\theta=1$ , $\tau^{2}=2$ , and $\alpha=\frac12$ . Therefore, $U_{1},U_{2},\ldots$ are i.i.d. $\chi_{k}^{2}$ random variables.
Now we state a number of asymptotic first- and second-order properties of the accelerated group sequential MRPE methodology $\mathcal{P}(\rho,k)$ , summarized in the following theorem.
Theorem 4. For the accelerated group sequential MRPE methodology $\mathcal{P}(\rho,k)$ given in (14), for all fixed $\mu,\sigma,A, k$ and $0<\rho<1$ , under the limit operation (15):
(i) Asymptotic first-order efficiency: $\mathbb{E}_{\mu,\sigma}[ N_{\mathcal{P}(\rho,k)}/n^{\ast}]\rightarrow 1$ ;
(ii) Asymptotic second-order efficiency: $\rho^{-1}\eta_1(k) + o(1) \leq \mathbb{E}_{\mu,\sigma}[N_{\mathcal{P}(\rho,k)}-n^{\ast}] \le \rho^{-1}\eta_1(k)+1+o(1)$ , where $\eta_1(k)=({k-1})/{2} - \frac{1}{2}\sum_{n=1}^{\infty}n^{-1}\mathbb{E}\big[\big\{\chi_{kn}^{2}-3kn\big\}^{+}\big]$ ;
(iii) Asymptotic first-order risk efficiency: $\xi_{\mathcal{P}(\rho,k)}(c) \rightarrow 1$ ;
(iv) Asymptotic second-order risk efficiency: $\omega_{\mathcal{P}(\rho,k)}(c) = \frac{1}{2}\rho^{-1}c+o(c)$ .
Again, when $\rho=1$ , we have the exact expression $\mathbb{E}_{\mu,\sigma}[N_{\mathcal{P}(1,k)}-n^{\ast}]=\eta_1(k)+o(1)$ instead of the inequality in Theorem 4(ii). The number of sampling operations for the accelerated group sequential MRPE methodology $\mathcal{P}(\rho,k)$ is
and
For any integer $k\ge 1$ , $\eta_1(k)=({k-1})/{2}-\frac{1}{2}\sum_{n=1}^{\infty}n^{-1}\mathbb{E}\big[\big\{\chi_{kn}^{2}-3kn\big\}^{+}\big]$ is computable. To obtain numerical approximations, we wrote our own R code and provide the values in Table 1. In the spirit of [Reference Mukhopadhyay and Solanky23, Table 3.8.1], any term smaller than $10^{-15}$ in magnitude was excluded from the infinite sum for $\eta_1(k)$ . Intuitively, the infinite sum and $k^{-1}\eta_1(k)$ converge to zero and $\frac12$ , respectively, as $k\rightarrow \infty $ . By looking at the columns for $\eta_1(k)$ and $k^{-1}\eta_1(k)$ , we can see that the infinite sum converges very quickly, while $k^{-1}\eta_1(k)$ converges at a rather slow rate.
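A computation of this kind can be scripted compactly; the sketch below is not the paper's original code. It uses the standard identity $\mathbb{E}[(\chi_{d}^{2}-a)^{+}]=d\,\mathbb{P}(\chi_{d+2}^{2}>a)-a\,\mathbb{P}(\chi_{d}^{2}>a)$ , which makes each term of the infinite sum available in closed form, and truncates once a term falls below $10^{-15}$ as described above. The helper name `chisq_excess` is ours.

```r
# Numerical evaluation of
#   eta1(k) = (k-1)/2 - (1/2) * sum_{n>=1} n^{-1} * E[{chi^2_{kn} - 3kn}^+],
# using E[(chi^2_d - a)^+] = d*P(chi^2_{d+2} > a) - a*P(chi^2_d > a).
chisq_excess <- function(d, a) {
  d * pchisq(a, df = d + 2, lower.tail = FALSE) -
    a * pchisq(a, df = d, lower.tail = FALSE)
}
eta1 <- function(k, tol = 1e-15) {
  total <- 0
  for (n in 1:5000) {
    term <- chisq_excess(k * n, 3 * k * n) / n
    total <- total + term
    if (term < tol) break          # the terms decay very quickly
  }
  (k - 1) / 2 - total / 2
}
round(sapply(c(1, 2, 5, 10), eta1), 6)
```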
Remark 3. The sign of $\eta(k)$ may imply whether a sequential sampling procedure leads to fewer (negative) or more (positive) observations than the optimal sample size on average upon termination. However, the sign depends on the specific inference problem and population distribution, and can be either positive or negative even when $k=1$ . This is because the magnitude of $\eta(k)$ involves problem-specific parameters $\delta$ , $\theta$ , $\tau^2$ , and $l_0$ .
Remark 4. Readers may have noticed that our loss function (8), while classical in sequential analysis, only combines the estimation error and the cost per sampled unit; the cost per sampled group is not included. In fact, it can be of significance to account for both the cost per sampled group and the cost per sampled unit in the loss function. For example, [Reference Schmegner and Baron29] considered a loss function given by $L_N \equiv L_N(\lambda,\hat{\lambda}_N)+\sum_{j=1}^{T}(cN_j+a)$ , where c is the cost per sampled unit, a is the cost per sampled group, $\lambda$ is the parameter under estimation, and $L_{N}(\lambda,\hat{\lambda}_N)$ represents the incurred loss due to the estimator $\hat{\lambda}_N$ and the total sample size $N=\sum_{j=1}^{T}N_j$ , where T indicates the total number of sampled groups and $N_j$ is the sample size of the jth group, $j=1,\ldots,T$ . Under this loss function, both the optimal group sizes $N_j$ and the optimal number of groups T need to be determined sequentially, so they are both stopping variables. In our proposed accelerated group sequential sampling scheme, however, the group size k is a prefixed constant. Since this difference would complicate the sampling scheme, we have neglected the cost per sampled group in the loss function (8). Nevertheless, this would be a very interesting future project to work on.
3.2. Simulated performance
To investigate the appealing properties of the accelerated group sequential MRPE methodology $\mathcal{P}(\rho,k)$ , and to illustrate how it saves sampling operations with $0<\rho <1$ and/or $k\geq 2$ , we conducted extensive sets of Monte Carlo simulations under the normal case in the spirit of [Reference Mukhopadhyay and Hu19]. To be specific, we generated pseudorandom samples from an $N(5,2^{2})$ population. Fixing the weight $A=100$ and the pilot sample size $m=21$ , we selected a wide range of values of c, the unit cost of sampling, including $0.04$ , $0.01$ , and $0.0025$ , so that the optimal fixed sample size $n^{\ast}$ turned out to be 100, 200, and 400 accordingly. We also considered various combinations of $\rho=(1,0.8,0.5)$ and $k=(1,2,5)$ to compare the number of sampling operations under different possible scenarios. The findings are summarized in Table 2. For each methodology $\mathcal{P}(\rho,k)$ , we computed the average total final sample size $\bar{n}$ with the associated standard error $s(\bar{n})$ , the difference between $\bar{n}$ and $n^{\ast}$ to be compared with the second-order efficiency term in Theorem 4(ii), the estimated risk efficiency $\widehat{\xi}$ to be compared with 1, the estimated regret in terms of unit cost $\widehat{\omega}/c$ to be compared with $\frac{1}{2}\rho^{-1}$ from Theorem 4(iv), and the average number of sampling operations $\bar{\varphi}$ to be compared with the expected number of sampling operations $\mathbb{E}(\varphi)$ from (18).
It is clear that across the board, $\bar{n}-n^*$ is close to the second-order approximation $\rho^{-1}\eta_1(k)$ , $\hat{\xi}$ is close to 1, and $\hat{\omega}/c$ is close to the coefficient $\frac{1}{2}\rho^{-1}$ . These findings empirically verify Theorem 4. Focusing on the last two columns, we can also easily find that the average number of sampling operations needed, $\bar{\varphi}$ , is almost the same as the theoretical value $\mathbb{E}(\varphi)$ , and the accelerated group sequential MRPE procedure $\mathcal{P}(\rho,k)$ requires approximately $100(1-k^{-1}\rho)\%$ fewer sampling operations than the Anscombe–Chow–Robbins purely sequential procedure $\mathcal{P}(1,1)$ . For example, when $n^*=400$ , $\mathcal{P}(1,1)$ requires around 380 sampling operations on average, while $\mathcal{P}(0.8,5)$ requires 62. So about $100(1-62/380)\%=83.7\%$ of the sampling operations are saved, which is close to $100(1-0.8/5)\%=84\%$ .
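A rough Monte Carlo check of these quantities can be run with the `P_rho_k()` sketch from Section 3.1; the run count and seed are arbitrary choices of ours, and the output is only meant to approximate the averages reported in Table 2, not reproduce them exactly.

```r
# Rough Monte Carlo check of Table 2-style averages, using the P_rho_k()
# sketch from Section 3.1 (10,000 independent runs).
set.seed(4)
runs <- replicate(10000, unlist(P_rho_k(5, 2, 100, 0.04, 4, 5, 0.8)))
rowMeans(runs[c("N", "ops"), ])   # compare with n* = 100 and E(phi)
```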
3.3. Real data analysis
Next, to illustrate the applicability of our newly developed accelerated group sequential MRPE methodology $\mathcal{P}(\rho,k)$ , we analyzed a real-life dataset on hospital infections from [Reference Kutner, Nachtsheim, Neter and Li12]. The data come from 113 hospitals in the United States for the 1975–76 study period. Each line of the dataset has an identification number and provides information on 11 other variables for a single hospital. One of these variables is the infection risk, which records the average estimated probability of acquiring an infection in hospital (in percent). With the cost of observations taken into consideration, it is of great interest to construct an MRPE for the infection risk.
The infection risk data appear to follow a normal distribution: the Shapiro–Wilk normality test yielded a p-value of $0.1339$ . Simple descriptive statistics for the whole dataset of infection risk are summarized in Table 3.
For illustrative purposes, we treated this dataset of infection risk with size 113 from [Reference Kutner, Nachtsheim, Neter and Li12] as our population, with both mean and variance assumed unknown. Then, we performed our accelerated group sequential MRPE methodologies to obtain minimum risk point estimators for the infection risk. We first randomly picked $m=11$ observations as a pilot sample, based upon which we proceeded with sampling according to the methodologies $\mathcal{P}(\rho,k)$ with $A=100$ , $c=0.04$ , $\rho=(1,0.8,0.5)$ , $k=(1,2,5)$ . We summarize the terminated sample sizes as well as the associated numbers of sampling operations under each setting in Table 4, where $\mathcal{P}(\rho,k)$ denotes a certain sampling procedure with fixed values of $\rho$ and k, and $n_{\mathcal{P}(\rho,k)}$ and $\varphi_{\mathcal{P}(\rho,k)}$ indicate the respective terminated sample size and number of sampling operations obtained by performing $\mathcal{P}(\rho,k)$ .
From Table 4, we can see that the terminated sample size ranges from 54 to 77. Clearly, many fewer sampling operations are needed when we fix $k=2$ or 5, without a significant increase in the number of observations. Also, for a fixed k value, we need the fewest observations and the fewest sampling operations when we use the methodology $\mathcal{P}(\rho,k)$ with $\rho=0.5$ , compared with the larger values $\rho=0.8$ or 1. The point estimates constructed from each sampling procedure are listed in the last column, and they are close to each other. Finally, we should reiterate that each row in Table 4 was obtained from one single run, but it shows the practical applicability of our accelerated group sequential MRPE methodology $\mathcal{P}(\rho,k)$ . We have indeed repeated similar implementations, and no obvious differences appeared; we have left out many details for brevity.
4. Bounded variance point estimation for negative exponential location
For a fixed sample size, however large it is, the variance of an estimator can be larger than a prescribed level to an arbitrary extent. This problem was addressed in [Reference Hu and Hong10], where the authors focused on estimating the pure premium in actuarial science. Here, our newly proposed accelerated group sequential sampling scheme $\mathcal{M}(\rho,k)$ can be implemented to guarantee that the variance of our estimator is close to all small predetermined levels. In this section, therefore, we include another illustration: the bounded variance point estimation (BVPE) for the location parameter $\mu$ of a negative exponential distribution $\textrm{NExp}(\mu,\sigma)$ with the probability density function
where both $\mu\in\mathbb{R}$ and $\sigma\in\mathbb{R}^{+}$ remain unknown. Having recorded a random sample $Y_1,\ldots,Y_n$ , $n\ge2$ , we write $Y_{n:1}=\min\{Y_1,\ldots,Y_n\}$ , which is the maximum likelihood estimator (MLE) of $\mu$ , and $V_n=(n-1)^{-1}\sum_{i=1}^{n}(Y_i-Y_{n:1})$ , which is the uniformly minimum variance unbiased estimator (UMVUE) of $\sigma$ . As a standard approach, we estimate $\mu$ using its MLE $Y_{n:1}$ , which is a consistent estimator.
It is well known that (i) $n(Y_{n:1}-\mu)/\sigma \sim \textrm{NExp}(0,1)$ ; (ii) $2(n-1)V_n/\sigma \sim \chi^2_{2n-2}$ ; and (iii) $Y_{n:1}$ and $(V_2,\ldots,V_n)$ , $n\ge2$ , are independent. Hence, the variance of the proposed point estimator $Y_{n:1}$ is $\mathbb{V}_{\mu,\sigma}[Y_{n:1}] = {\sigma^2}/{n^2}$ . Now, our goal is to make $\mathbb{V}_{\mu,\sigma}[Y_{n:1}]$ fall below (or be close to) a predetermined level $b^2$ , $b>0$ , for all $0<\sigma<\infty$ . Then, it is clear that we have $n \ge \sigma/b$ . The optimal fixed sample size is therefore given by
See [Reference Mukhopadhyay and de Silva18, p. 183] for more information.
Since $\sigma$ is unknown to us, we estimate it by updating its UMVUE $V_n$ at every stage as needed, and implement the following accelerated group sequential BVPE methodology $\mathcal{Q}(\rho,k)$ :
Here, $0<\rho \le 1$ is a prefixed proportion, $k\ge 1$ is a prefixed positive integer, and the pilot sample size is $m=m_0k+1$ for some integer $m_0\ge1$ . We further assume that the following limit operation holds:
where $r\ge1$ is a fixed constant. The methodology $\mathcal{Q}(\rho,k)$ is conducted analogously to the methodology $\mathcal{P}(\rho,k)$ introduced in Section 3.
Again, it is clear that $\mathbb{P}_{\mu,\sigma}(N_{\mathcal{Q}(\rho,k)}<\infty)=1$ and $N_{\mathcal{Q}(\rho,k)}\uparrow \infty $ with probability 1 as $b\downarrow 0$ . Upon termination with the fully gathered data
we construct the bounded variance point estimator $Y_{N_{\mathcal{Q}(\rho,k)}:1}=\min\{Y_1,\ldots,Y_{N_{\mathcal{Q}(\rho,k)}}\}$ for $\mu$ .
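Analogously to the MRPE sketch in Section 3.1, the following R sketch simulates one run of $\mathcal{Q}(\rho,k)$ . The sequential boundary $n\ge\rho V_{n}/b$ is read off from $n^{\ast}=\sigma/b$ in (19), and the batch-augmentation rule is our reading of (20); parameter values are illustrative ( $b=0.02$ gives $b^{2}=0.0004$ and $n^{\ast}=100$ , as in Section 4.1).

```r
# One simulated run of the accelerated group sequential BVPE methodology
# Q(rho, k): sample k at a time until n >= rho * V_n / b, then augment in
# one single batch.
Q_rho_k <- function(mu, sigma, b, m0, k, rho) {
  m <- m0 * k + 1
  y <- mu + rexp(m, rate = 1 / sigma)           # NExp(mu, sigma) pilot sample
  V <- function(y) sum(y - min(y)) / (length(y) - 1)   # UMVUE V_n of sigma
  while (length(y) < rho * V(y) / b) {
    y <- c(y, mu + rexp(k, rate = 1 / sigma))   # one group of k observations
  }
  n_final <- max(length(y), ceiling(length(y) / rho))  # batch augmentation
  y <- c(y, mu + rexp(n_final - length(y), rate = 1 / sigma))
  list(N = length(y), estimate = min(y))        # MLE Y_{N:1} of mu
}
set.seed(3)
Q_rho_k(mu = 5, sigma = 2, b = 0.02, m0 = 2, k = 5, rho = 0.8)
```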
Along the lines of (16), we define $T_{\mathcal{Q}(\rho,k)}=T_{\mathcal{Q}(\rho,k)}^{\prime}-m_{0}$ with probability 1. Then $T_{\mathcal{Q}(\rho,k)}^{\prime}$ is a new stopping time that can be rewritten as
where $\delta=1$ , $l_{0}=1$ , and $U_{i}=\sum_{j=(i-1)k+1}^{ik}W_{j}$ , $i=1,2,\ldots$ , with $W_{1},W_{2},\ldots$ being i.i.d. $\chi_{2}^{2}$ random variables such that $\theta=2$ , $\tau^{2}=4$ , and $\alpha=1$ . Therefore, $U_{1},U_{2},\ldots$ are i.i.d. $\chi_{2k}^{2}$ random variables. Now we state a number of asymptotic first- and second-order properties of the accelerated group sequential BVPE methodology $\mathcal{Q}(\rho,k)$ , summarized in the following theorem.
Theorem 5. For the accelerated group sequential BVPE methodology $\mathcal{Q}(\rho,k)$ given in (20), for all fixed $\mu,\sigma,k$ and $0<\rho<1$ , under the limit operations (21):
(i) Asymptotic first-order efficiency: $\mathbb{E}_{\mu,\sigma}[N_{\mathcal{Q}(\rho,k)}/n^{\ast}] \rightarrow 1$ ;
(ii) Asymptotic second-order efficiency: $\rho^{-1}\eta_2(k)+o(1)\leq\mathbb{E}_{\mu,\sigma}[N_{\mathcal{Q}(\rho,k)}-n^{\ast}]\le\rho^{-1} \eta_2 (k)+ 1+ o(1)$ , where $\eta_2(k)=({k-1})/{2} - \frac{1}{2}\sum_{n=1}^{\infty}n^{-1}\mathbb{E}\big[\big\{\chi_{2kn}^{2}-4kn\big\}^{+}\big]$ ;
(iii) Asymptotic variance: $\mathbb{V}_{\mu,\sigma}[Y_{N_{\mathcal{Q}(\rho,k)}:1}]=b^2+o(b^2)$ .
When $\rho=1$ , we have the exact expression $\mathbb{E}_{\mu,\sigma}[N_{\mathcal{Q}(1,k)}-n^*]=\eta_2(k)+o(1)$ instead of the inequality in Theorem 4(ii). The number of sampling operations for the accelerated group sequential BVPE methodology $\mathcal{Q}(\rho,k)$ is
and
For any integer $k\ge 1$ , $\eta_2(k)=({k-1})/{2}-\frac{1}{2}\sum_{n=1}^{\infty}n^{-1}\mathbb{E}\big[\big\{\chi_{2kn}^{2}-4kn\big\}^{+}\big]$ is also computable. Table 5 provides some numerical approximations in the same fashion as Table 1.
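As with $\eta_1(k)$ , a short R sketch can approximate $\eta_2(k)$ , reusing the hypothetical `chisq_excess()` helper from the sketch in Section 3.1.

```r
# eta2(k) = (k-1)/2 - (1/2) * sum_{n>=1} n^{-1} E[{chi^2_{2kn} - 4kn}^+],
# reusing chisq_excess() from the eta1() sketch above.
eta2 <- function(k, tol = 1e-15) {
  total <- 0
  for (n in 1:5000) {
    term <- chisq_excess(2 * k * n, 4 * k * n) / n
    total <- total + term
    if (term < tol) break
  }
  (k - 1) / 2 - total / 2
}
round(sapply(c(1, 2, 5, 10), eta2), 6)
```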
4.1. Simulated performance
In this section, we summarize selected Monte Carlo simulation results to demonstrate the appealing first- and second-order properties of the accelerated group sequential BVPE methodology given in (20). We investigated a wide range of scenarios in terms of the location and scale parameters of the negative exponential population, as well as the prespecified parameters: the bound $b^2$ on the variance, and the two parameters $\rho$ and k of $\mathcal{Q}(\rho,k)$ for different sampling schemes. For brevity, we summarize the results from pseudorandom samples of an NExp(5,2) population in Table 6. We specified $b^2=0.0004$ , $0.0001$ , and $0.000\,025$ , so that the optimal fixed sample size $n^{\ast}$ turned out to be 100, 200, and 400 accordingly. To compare the number of sampling operations under different possible scenarios, we considered the combinations of $\rho=(1.0,0.8,0.5)$ and $k=(1,2,5)$ . For each sampling scheme $\mathcal{Q}(\rho,k)$ , we included the average total final sample size $\bar{n}$ with the associated standard error $s(\bar{n})$ , the difference between $\bar{n}$ and $n^{\ast}$ to be compared with the second-order efficiency term $\rho^{-1}\eta_2(k)$ in Theorem 5(ii), the variance $\mathbb{V}(Y_{N:1})$ , which should be close to the asymptotic variance in Theorem 5(iii), and the average number of sampling operations $\bar{\varphi}$ to be compared with the expected number of sampling operations $\mathbb{E}(\varphi)$ from (24).
From Table 6, it is obvious that $\bar{n}-n^*$ stays close to each of the second-order approximations $\rho^{-1}\eta_2(k)$ . From the sixth and seventh columns, we can also easily see that the average number of sampling operations needed, $\bar{\varphi}$ , is very close to its theoretical value $\mathbb{E}(\varphi)$ . Moreover, the sampling operations for $\mathcal{Q}(\rho,k)$ are significantly reduced when $0<\rho<1$ and/or $k>1$ , compared to the Anscombe–Chow–Robbins purely sequential procedure $\mathcal{Q}(1,1)$ under the same b; the reductions are approximately $100(1-k^{-1}\rho)\%$ . The last column of Table 6 shows that the variance of the smallest observations is approximately $b^2$ across the board.
4.2. Real data analysis
In this section, we implement the accelerated group sequential BVPE methodology $\mathcal{Q}({\rho,k})$ as per (20) on a real dataset about survival times of a group of patients suffering from head and neck cancer who were treated using a combination of radiotherapy and chemotherapy; the dataset has been presented in multiple research articles [Reference Efron4, Reference Shanker, Fesshaye and Selvaraj30, Reference Zhuang and Bapat35].
It is fair to assume that the survival time data follow a negative exponential distribution, as claimed in [Reference Zhuang and Bapat35]; a Kolmogorov–Smirnov test yielded a test statistic of $0.156\,86$ with a p-value of $0.5572$ . Assuming that researchers in this study want to use the smallest observation of the sample data to estimate the location parameter $\mu$ , and that they also want to restrict the variance of the estimator to $b^2=100$ , we implement the sampling scheme $\mathcal{Q}({\rho,k})$ with combinations of $\rho= (1,0.8,0.5)$ and $k=(1,2,5)$ .
For each sampling scheme $\mathcal{Q}(\rho,k)$ , the sampling procedure is as follows: we randomize the order of all the observations and pretend they are not yet known to us, so that the data come in the randomized order. We start with $m=11$ observations and proceed with sampling as per (20). We summarize the terminated sample size and the corresponding number of sampling operations in Table 7. The columns are defined similarly to Table 4.
We can see from Table 7 that the terminated sample size ranges from 28 to 42, with the fewest observations when using the sampling scheme $\mathcal{Q}(\rho=1,k=1)$ and the most when using $\mathcal{Q}(\rho=0.5,k=5)$ . Moreover, the number of sampling operations is reduced the most when using $\mathcal{Q}(\rho=0.5,k=5)$ . Also, with the same $\rho$ , a larger k means fewer sampling operations; with the same k, a smaller $\rho$ means fewer sampling operations. The last column records the point estimates, which are the minimum survival times observed in each sampling procedure. We should emphasize that all of these results were obtained from one single run, but we have indeed repeated similar implementations and there was little to no difference. We also want to emphasize that this real data example is only for illustration purposes; it shows how our methodology can be used in real research problems.
5. Proofs
Note that Theorems 1 and 2 follow immediately from [Reference Woodroofe34, Theorem 2.4], (3) follows from Theorem 1 and the definition of $\varphi_{\mathcal{M}_0}$ , (7) follows from Theorem 2, and Theorem 3 is paraphrased from [Reference Mukhopadhyay and de Silva18, (6.4.14)]. In this section, therefore, we only prove Theorems 4 and 5.
5.1. Proof of Theorem 4 and (18)
By the stopping rule defined in (14), we have the following inequality:
where $m+kT_{\mathcal{P}(\rho,k)}<m+k+\rho S_{m+k(T_{\mathcal{P}(\rho,k)}-1)}\sqrt{A/c}$ . Therefore,
On the other hand, we also have the inequality that
from which we conclude that
Combining this with (26), it is clear that
Note that, for sufficiently small c, with the limit operation given in (15), we have (with probability 1)
Since $(\mathbb{E}_{\mu,\sigma}[\sup\nolimits_{n\ge 2}S_{n}])^2 \leq \mathbb{E}_{\mu,\sigma}[(\sup\nolimits_{n\geq 2}S_{n})^{2}]\leq \mathbb{E}_{\mu,\sigma}[\sup\nolimits_{n\ge 2}S_{n}^{2}]$ , and Wiener’s ergodic theorem [Reference Wiener33, Theorem IV] gives $\mathbb{E}_{\mu,\sigma}[\sup\nolimits_{n\ge 2}S_{n}^{2}]<\infty $ , it follows from (28) and the dominated convergence theorem that $\mathbb{E}_{\mu,\sigma}[N_{\mathcal{P}(\rho,k)}/n^{\ast}]\rightarrow 1$ as $c\rightarrow 0$ . Since $c=A\sigma^{2}/n^{\ast 2}$ from (9), Theorem 4(i) holds under the limit operations (15).
Recall that $T_{\mathcal{P}(\rho,k)}^{\prime}$ defined in (16) is of the same form as $t_{0}$ from (1). Then, referring to [Reference Woodroofe34, (1.1)] or [Reference Mukhopadhyay and de Silva18, Section A.4], we have, as $c\rightarrow 0$ , for $m_{0}\geq 2$ ,
Therefore, with $T_{\mathcal{P}(\rho,k)}^{\prime}=T_{\mathcal{P}(\rho,k)}+m$ with probability 1 and $m=m_{0}k+1$ , we have
Putting together (17) and (29), we obtain (18). Under the limit operations (15), Theorem 4(ii) then follows immediately from (29) and the inequalities given in (25) and (27).
Next, we state the following lemmas to derive the desirable results in Theorem 4(iii) and (iv).
Lemma 1. For the accelerated group sequential MRPE methodology $\mathcal{P}(\rho,k)$ given in (14), under the limit operations (15), for any arbitrary $0< \varepsilon <1$ , with some $\gamma \ge 2$ ,
Proof. Recall that $\lfloor u\rfloor$ denotes the largest integer that is smaller than u, and define
It should be obvious that $0\le T_{\mathcal{P}(\rho,k)}\le t_{u}$ . Then, the rate at which $\mathbb{P}_{\mu,\sigma}\{N_{\mathcal{P}(\rho,k)}\le \varepsilon n^{\ast}\}$ may converge to zero under the limit operations (15) is given by
where the last inequality comes from Kolmogorov’s inequality for reversed martingales. See [Reference Hu and Mukhopadhyay11, Section 5.1] for more details.
Lemma 2. For the accelerated group sequential MRPE methodology $\mathcal{P}(\rho,k)$ given in (14), under the limit operations (15),
Proof. First, we prove that
based on the inequalities
It is not hard to see that $(\lfloor\rho n^{\ast}\rfloor + 1)^{1/2}\big(S_{\lfloor\rho n^{\ast}\rfloor+1}/\sigma-1\big) \overset{\textrm{d}}{\rightarrow} N\big(0,\tfrac{1}{2}\big)$ as $c\rightarrow 0$ , and the sequence $\{S_{\lfloor\rho n^{\ast}\rfloor+1}\}$ is uniformly continuous in probability [Reference Anscombe1, Reference Anscombe2]. From previous results, we can easily show that
Now, Anscombe’s random central limit theorem [Reference Anscombe1] leads to
as $c\rightarrow 0$ . Hence, (30) holds. Next, with the inequalities that
Lemma 2(i) follows immediately, and Slutsky’s theorem provides Lemma 2(ii) under the limit operations (15).
Lemma 3. For the accelerated group sequential MRPE methodology $\mathcal{P}(\rho,k)$ given in (14), under the limit operations (15), $\big(N_{\mathcal{P}(\rho,k)}-n^{\ast}\big)^{2}/n^{\ast}$ is uniformly integrable.
Proof. In the light of [Reference Hu and Mukhopadhyay11, Theorem 3.4], we can prove that, for sufficiently small $c\le c_{0}$ by choosing some $c_{0}$ $(>0)$ appropriately, $(\rho n^{\ast})^{-1}\big(m+kT_{\mathcal{P}(\rho,k)}-\rho n^{\ast}\big)^{2}$ is uniformly integrable. Therefore, under the limit operations (15), we have Lemma 3 by applying the inequalities given in (31).
Now, Theorem 4(iii) and (iv) follow from Lemmas 1–3. Alternatively, appealing to nonlinear renewal theory, we can also prove the same results in the spirit of [Reference Woodroofe34]. Many details are left out for brevity.
5.2. Proof of Theorem 5 and (24)
In the same fashion as we proved Theorem 4(i), we have
As $b \to 0$ , the two bounds of these inequalities both tend to 1 in probability, so
For sufficiently small b, with the limit operation (21), we have (with probability 1)
where $2(n-1)V_n/\sigma \sim \chi^2_{2n-2}$ . Similarly, $\mathbb{E}_{\mu,\sigma}[\sup\nolimits_{n\ge 2}V_{n}]<\infty$ follows from Wiener’s ergodic theorem [Reference Wiener33, Theorem IV], so that $\mathbb{E}_{\mu,\sigma}[{N_{\mathcal{Q}(\rho,k)}}/{n^*}] \to 1$ as $b \to 0$ . The proof of Theorem 5(i) is complete.
Then, recall that $T_{\mathcal{Q}(\rho,k)}^{\prime}$ defined in (22) is of the same form as $t_{0}$ from (1). So, referring to [Reference Woodroofe34, (1.1)] or [Reference Mukhopadhyay and de Silva18, Section A.4], we have, as $b\rightarrow 0$ , for $m_{0}\geq 2$ ,
Therefore, with $T_{\mathcal{Q}(\rho,k)}^{\prime}=T_{\mathcal{Q}(\rho,k)}+m$ with probability 1 and $m=m_{0}k+1$ , we have
Putting together (23) and (32), we obtain (24). Under the limit operations (21), Theorem 5(ii) then follows immediately from (32).
To evaluate the asymptotic variance in Theorem 5(iii), we utilize the law of total variance and obtain
since the event $\{N_{\mathcal{Q}(\rho,k)}=n\}$ depends on $V_n$ alone and is therefore independent of $Y_{n:1}$ .
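To spell out this step: by fact (i) in Section 4, given $N_{\mathcal{Q}(\rho,k)}=n$ , the estimator $Y_{n:1}$ has conditional mean $\mu+\sigma/n$ and conditional variance $\sigma^{2}/n^{2}$ , so the decomposition reads
$$\mathbb{V}_{\mu,\sigma}\big[Y_{N_{\mathcal{Q}(\rho,k)}:1}\big]=\mathbb{E}_{\mu,\sigma}\big[\sigma^{2}N_{\mathcal{Q}(\rho,k)}^{-2}\big]+\mathbb{V}_{\mu,\sigma}\big[\mu+\sigma N_{\mathcal{Q}(\rho,k)}^{-1}\big]=\sigma^{2}\big\{\mathbb{E}_{\mu,\sigma}\big[N_{\mathcal{Q}(\rho,k)}^{-2}\big]+\mathbb{V}_{\mu,\sigma}\big[N_{\mathcal{Q}(\rho,k)}^{-1}\big]\big\}.$$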
Applying Taylor’s theorem to expand $N_{\mathcal{Q}(\rho,k)}^{-j}$ , $j\ge1$ , around $n^*$ , we have
where $\lambda$ is a random variable lying between $N_{\mathcal{Q}(\rho,k)}$ and $n^*$ . Combining (32), (34), (19), and Theorem 5(ii) yields $\mathbb{V}_{\mu,\sigma}[Y_{N_{\mathcal{Q}(\rho,k)}:1}] = b^2 + O(b^3) = b^2+o(b^2)$ , which completes the proof.
6. Concluding remarks
We have proposed a novel accelerated group sequential sampling scheme with the motivation of saving sampling operations while retaining efficiency. Following the idea of drawing multiple observations at a time sequentially to determine a preliminary sample, and then gathering the rest of the observations all in one batch, we demonstrated the MRPE and BVPE problems under the new sampling scheme as possible illustrations. Furthermore, the new sequential sampling scheme can be applied to deal with other statistical inference problems, including, but not limited to, sequential analogues of Behrens–Fisher problems (see, e.g., [Reference Robbins, Simons and Starr28]), fixed-width confidence intervals (see, e.g., [Reference Hall7]), ranking and selection (see, e.g., [Reference Mukhopadhyay and Solanky23]), bounded-risk point estimation (see, e.g., [Reference Mukhopadhyay and Bapat17]), treatment means comparison (see, e.g., [Reference Mukhopadhyay, Hu and Wang20]), etc.
Due to the appealing properties of our newly developed methodology and the substantial savings in sampling operations it achieves, further investigations into problems that researchers have recently been working on will be of great interest. A full list would be long, so we mention just two directions: (i) [Reference Schmegner and Baron29] proposed a sequentially planned probability ratio test as a sequentially planned extension of the famous Wald sequential probability ratio test; and (ii) [Reference Mukhopadhyay and Zhuang26] worked on two-sample mean comparisons of normal distributions with unknown and unequal variances, where they developed both purely sequential and two-stage methodologies. Our sequential sampling design can be directly applied to their problem settings and is expected to save sampling operations significantly.
On another note, recall from Section 3.1 that our loss function for estimating a normal mean has only included the cost per sampled unit. In certain situations, we may have to consider both the cost per sampled group and the cost per sampled unit. Then, a different type of (accelerated) group sequential sampling scheme will be desired. We think it would be a very interesting future research problem to explore, perhaps developing two stopping variables for statistical inference with a minimal cost: one for determining the size of a group, and the other for determining the number of groups needed.
Acknowledgement
The authors thank the editor and the three anonymous reviewers for their valuable comments that helped improve this paper.
Funding information
There are no funding bodies to thank relating to the creation of this article.
Competing interests
There were no competing interests to declare which arose during the preparation or publication process of this article.