## Introduction

Given the extent of global travel patterns [Reference Glaesser1–Reference Kraemer3], newly emerging diseases can rapidly spread globally. In general, respiratory pathogens spread faster [Reference Fineberg4] than vector-borne viruses [Reference Redondo-Bravo5, Reference Halstead and Wilder-Smith6] or those that require very close contact such as Ebola [Reference Tuite7] or Lassa [Reference Wolf8]. In late 2019, a novel coronavirus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) emerged in Wuhan, China, which rapidly spread globally [Reference Bogoch9] and caused epicentres of COVID-19 disease in multiple countries. SARS-CoV-2 has a high reproduction rate and is easily transmitted via respiratory droplets among humans [Reference Liu10]. Population flow data between Wuhan and other major cities in mainland China were clearly correlated with the number of cases exported from Wuhan to other city clusters in mainland China before the lock-down [Reference Lau11]. The potential for rapid international spread via air travel was enormous, and indeed first exportations followed high travel volumes to Thailand, Hong Kong and Singapore [Reference Bogoch9]. As of 1 June 2020, more than 6 million cases and more than 360 000 deaths due to COVID-19 have been reported in more than 200 countries.

The speed of spread depends on the air passenger volumes, the basic reproduction rate as a measure of transmissibility and the incubation time [Reference Massad12, Reference Lopez13]. In this paper we present two complementary methods for calculating the risk of international spread of a new virus from an epicentre. The first method aims to calculate the number of cases that would be exported from an endemic country to disease-free regions by travellers. The second method calculates the probability that one of the infected travellers will generate at least one secondary autochthonous case in the visited country. The calculation for disease exportation is simpler than the calculation for infection importation. One difference is that in the case of disease importation travellers to endemic areas return infective to their home country, whereas in the case of disease exportation travellers depart from their endemic home country in a latent state. This latter assumption is based on the conjecture that symptomatic individuals do not travel. For asymptomatic travellers, their disease will manifest itself either during the flight or after arrival in the visited disease-free country dependent on the time of infection and incubation time.

In the case of disease importation, the key parameter is the force of infection of the disease in the visited endemic country. In the case of disease exportation, the key parameter is the latency duration of the disease in the travellers' home country. In the case of disease importation latency is not too important and the model considers only susceptible, infected and removed individuals. On the other hand, in the case of disease exportation latency is important because it is assumed that infected and symptomatic individuals are either so sick that they do not manage to travel or are not allowed to board the plane due to exit screening.

## Methods

### The models

#### Model 1. Calculating the number of exported cases from an endemic country

In this section we consider the case of infective travellers from an endemic country visiting a disease-free country, and so exporting the infection to the visited country. Once arriving in the visited disease-free country those infective visitors may trigger an outbreak that can establish itself depending on the value of the basic reproduction number *R* _{0} of the infection in the disease-free country. If *R* _{0} is greater than one, the disease will spread. We will approach the problem with a deterministic formulation.

The model is a classic susceptible-exposed-infected-removed (SEIR) model given by the following set of equations:

where *S* _{H}(*t*) is the number of susceptible individuals, *E* _{H}(*t*) is the number of incubating and asymptomatic individuals, who have the disease but do not transmit it, *I* _{H}(*t*) is the number of infectious individuals, *R* _{H}(*t*) is the number of individuals recovered from infection and *N* _{H}(*t*) = *S* _{H}(*t*) + *E* _{H}(*t*) + *I* _{H}(*t*) + *R* _{H}(*t*) is the total population. The parameters are *β*, the potentially infective contact rate; *δ* _{H}, the inverse of the incubation (or latency) period; *γ* _{H}, the recovery rate (the inverse of the duration of infectiousness); and *μ* _{H} and *α* _{H}, the natural and disease-induced mortality rates, respectively.
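The compartmental structure described above can be sketched numerically. The Python fragment below is a minimal illustration and not the authors' implementation: it assumes the standard SEIR flows implied by the parameter definitions (infection at rate *βS* _{H}*I* _{H}/*N* _{H}, progression at rate *δ* _{H}, recovery at rate *γ* _{H}, mortality at rates *μ* _{H} and *α* _{H}) and, as a further assumption not stated in the text, a birth term *μ* _{H}*N* _{H} that replaces natural deaths. All names and parameter values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SEIRParams:
    beta: float   # potentially infective contact rate (per day)
    delta: float  # inverse of the latency period (per day)
    gamma: float  # recovery rate (per day)
    mu: float     # natural mortality rate (per day)
    alpha: float  # disease-induced mortality rate (per day)


def derivatives(state, p):
    """Right-hand side of an assumed SEIR system for (S, E, I, R)."""
    s, e, i, r = state
    n = s + e + i + r
    new_infections = p.beta * s * i / n   # incidence lambda(t)
    births = p.mu * n                     # ASSUMPTION: births replace natural deaths
    ds = births - new_infections - p.mu * s
    de = new_infections - (p.mu + p.delta) * e
    di = p.delta * e - (p.mu + p.gamma + p.alpha) * i
    dr = p.gamma * i - p.mu * r
    return (ds, de, di, dr)


def rk4_step(state, p, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(x + h * dx for x, dx in zip(state, k))

    k1 = derivatives(state, p)
    k2 = derivatives(shifted(k1, dt / 2), p)
    k3 = derivatives(shifted(k2, dt / 2), p)
    k4 = derivatives(shifted(k3, dt), p)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(state, p, days, dt=0.05):
    """Return the list of (S, E, I, R) states from day 0 to `days`."""
    trajectory = [state]
    for _ in range(int(round(days / dt))):
        state = rk4_step(state, p, dt)
        trajectory.append(state)
    return trajectory


# Hypothetical values, chosen only to exercise the code.
params = SEIRParams(beta=0.5, delta=1 / 3, gamma=1 / 7, mu=0.0, alpha=0.0)
trajectory = simulate((999_999.0, 0.0, 1.0, 0.0), params, days=60)
peak_infectious = max(state[2] for state in trajectory)
```

A fixed-step Runge-Kutta integrator is used here only to keep the sketch free of external dependencies; any standard ODE solver would serve equally well.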

Hence, the number of new infections per unit of time corresponds to the infection incidence, denoted *λ*(*t*) = *βS* _{H}(*t*)(*I* _{H}(*t*)/*N* _{H}(*t*)).

The basic reproduction number, *R* _{0}, that is, the number of secondary infections produced by one infectious individual in an entirely susceptible population during his/her infectiousness period, associated with system (1) is derived in Appendix A:

For exportations, our interest is the prevalence of latent infections in the local population, from which some individuals will travel already infected but not yet symptomatic. We estimated the disease prevalence in the population, that is, the number of infected individuals at each instant of time, *I* _{H}(*t*), by integrating the third equation of system (1) to obtain [Reference Kraemer3]:

Dividing *I* _{H}(*t*) by the size of the local population, *N* _{H}, we obtain the prevalence, that is, the proportion of infectious individuals, *p* _{I}(*t*), in the endemic country as follows:

On the other hand, multiplying the number of visitors to a given disease-free country by the prevalence of latent (infected but not yet infectious) individuals, *p* _{E}(*t*), generates the number of infected visitors, that is, exportations of infection. Integrating the second equation of system (1) yields the number of exposed or latent individuals, *E* _{H}(*t*):

Dividing *E* _{H}(*t*) by the total population, *N* _{H}, yields the prevalence of infected but not yet infectious individuals in the home country as follows:

To obtain this prevalence, the force of infection of the disease in this endemic region, that is, the per capita rate at which susceptible individuals become infected, *β*(*I* _{H}(*t*)/*N* _{H}), is a necessary input variable. The best information normally available is the notification rate of infectious individuals, *δ* _{H}*E* _{H} (the number of individuals per unit time that progress from the latent to the infectious state), provided by disease surveillance systems. Equation (6) will be used later in the paper.

#### Model 2. Calculating the probability of infection introduction in a disease-free country

In this section we calculate the probability that an infected traveller (index case) from an endemic country arriving infective in a disease-free country generates a secondary autochthonous case.

On arriving in the disease-free country, each infected visitor may trigger an outbreak, which will establish itself only if the basic reproduction number *R* _{0} of the infection is greater than one. Since we are dealing with a low number of travellers, we need to approach the problem with a stochastic formulation.

This model assumes that a density of one infected individual, *I* _{H}(*t* _{0}), arrives at *t* = *t* _{0} and remains infective for a period of (*μ* _{H} + *γ* _{H} + *α* _{H})^{−1} days, that is

where *μ* _{H}, *γ* _{H} and *α* _{H} are the natural mortality rate, the recovery rate from infection and the disease-induced mortality rate, respectively. If the region where these infected travellers arrive has an area *A*, their number is *I* _{H}(*t* _{0})*A*.

The total number of new cases infected by these travellers Δ weeks after their introduction, denoted New Cases, is given by

where *β* is the potentially infective contact rate between infected and susceptible individuals, and *S* _{H} and *N* _{H} are the susceptible and total populations, respectively.

The risk of invasion of a previously unaffected country, Risk_{new cases}, can be defined as the probability that at least one autochthonous case is produced by the arrival of a single infected individual in the area during his/her infectiousness period. To calculate this risk, we assumed a non-homogeneous simple birth process [Reference Redondo-Bravo5], which describes the propagation of the disease.

Let *P* _{n}(*t*) be the probability of *n* cases. The probability generating function of such a process is $P(x, t) = \sum_n P_n(t)\,x^n$. After some calculation, we obtain the probability of *x* cases at time *t* [Reference Redondo-Bravo5, Reference Halstead and Wilder-Smith6]:

where *a* = *I* _{H}(*t* _{0})*A*, *λ*(*t*) = *β*(*I* _{H}(*t* _{0})*S* _{H}(*t*)/*N* _{H}) and $\rho(t) = \int_0^t -\lambda(\tau)\,\theta(\tau - t_0)\,\mathrm{d}\tau$.

Because the region under study has an area *A*, the number of infected travellers arriving at *t* = *t* _{0} is *a* = *I* _{H}(*t* _{0})*A*. We set *a* = 1 from now on, that is, a single index case arrives in the non-affected area.

Expanding (9) in powers of *x*, we find the risk, that is, the probability of having *n* infected individuals at time *t*, denoted Risk_{new cases}(*n*, *t*), as

$$\mathrm{Risk}_{\mathrm{new\;cases}}(n, t) = \sum_{j=0}^{\min(n, a)} \binom{a}{j} \binom{a+n-j-1}{a-1}\, \pi^{a-j}\, \sigma^{n-j}\, (1-\pi-\sigma)^{j} \tag{9a}$$

The risk (probability) of having no infected individuals is

$$\mathrm{Risk}_{\mathrm{new\;cases}}(0, t) = \pi^{a} \tag{9b}$$

In equations (9a) and (9b), $\pi = 1 - \dfrac{1}{e^{\rho(t)} + \int_0^t \lambda(\tau)\,\theta(\tau - t_0)\,e^{\rho(\tau)}\,\mathrm{d}\tau}$ and $\sigma = 1 - \dfrac{e^{\rho(t)}}{e^{\rho(t)} + \int_0^t \lambda(\tau)\,\theta(\tau - t_0)\,e^{\rho(\tau)}\,\mathrm{d}\tau}$.
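Equations (9a) and (9b) are straightforward to evaluate once *π* and *σ* are known. The following Python sketch does so for given values of *π*, *σ* and *a*; the function names are hypothetical, and the numerical values of *π* and *σ* are placeholders chosen only to exercise the code, not values derived from the Hubei data. The complement of (9b) gives the probability of at least one case.

```python
from math import comb


def risk_new_cases(n, pi, sigma, a=1):
    """Probability of n infected individuals, following equation (9a)."""
    total = 0.0
    for j in range(min(n, a) + 1):
        total += (
            comb(a, j)
            * comb(a + n - j - 1, a - 1)
            * pi ** (a - j)
            * sigma ** (n - j)
            * (1 - pi - sigma) ** j
        )
    return total


def risk_at_least_one(pi, a=1):
    """Complement of equation (9b): probability of one or more cases."""
    return 1 - pi ** a


# Placeholder values of pi and sigma (hypothetical).
pi_example, sigma_example = 0.77, 0.40
no_case = risk_new_cases(0, pi_example, sigma_example)
total_probability = sum(
    risk_new_cases(n, pi_example, sigma_example) for n in range(200)
)
```

Summing (9a) over all *n* should return one for any admissible *π* and *σ*, which is a convenient check on an implementation.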

The probability of at least one autochthonous case in a previously unaffected region can be calculated as the tail probability, that is, the probability of the infection invading the previously non-affected area:

For *m* = 1, equation (10) reduces to

## Results. Illustrating the models

To illustrate the models' performance, we consider the outbreak of COVID-19 in the province of Hubei, China. At the time of writing, this province accounted for approximately 84% of the total number of COVID-19 cases in the world.

We used data from the WHO website [Reference Tuite7]. As we use the Hubei outbreak only to illustrate the models, we calculated the incidence of cases in that province by multiplying the total daily number of cases worldwide by 0.84. As China modified the diagnostic criteria during the course of the outbreak, we used incidence data only until 11 February 2020.

### Illustrating model 1

We begin by fitting a continuous function to the daily number of reported cases, that is, the incidence of new cases per time unit, *δ* _{H}*E* _{H}(*t*). We assume here that new cases are symptomatic cases. This assumption is probably very reasonable for the Hubei epidemic because tests were developed only at the end of the Hubei outbreak. The function has the bell-shaped form:

where *c* _{i} (*i* = 1, 2, 3) are the fitting parameters and *t* is the time.

The fit of equation (12) to the reported new cases per time unit is shown in Figure 1.

Note that equation (12) fits the Hubei COVID-19 cases reasonably well. If equation (12) is inserted into equation (4), the COVID-19 prevalence at each instant of time *I* _{H}(*t*) is obtained.

On the other hand, we can fit the initial exponential phase (e^{φt}) of the prevalence curve obtained in Figure 1. It is then possible to estimate the value of *R* _{0} according to (see Appendices A and B):

where *φ* is the rate of exponential growth and *μ* _{H}, *α* _{H}, *γ* _{H} and *δ* _{H} are as in system (1). The parameter values used to calculate *R* _{0} are shown in Table 1.

The estimated *R* _{0} averaged 5.31 (range 4.08 to 7.91) for the outbreak in the province of Hubei. Figure 2 shows the fit of an exponential function to the data from Hubei province.
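The two steps described here, estimating the exponential rate *φ* and converting it into *R* _{0}, can be sketched as follows. The conversion used in the code is the relation obtained by substituting a solution proportional to e^{φt} into the linearised *E* _{H} and *I* _{H} equations of an SEIR model with mortality; it is assumed, not confirmed, to correspond to the expression referred to above. The function names and the synthetic case counts are hypothetical and do not reproduce the values of Table 1.

```python
import math


def fit_growth_rate(days, cases):
    """Least-squares slope of ln(cases) against time, i.e. the rate phi."""
    logs = [math.log(c) for c in cases]
    n = len(days)
    mean_t = sum(days) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, logs))
    sxx = sum((t - mean_t) ** 2 for t in days)
    return sxy / sxx


def r0_from_growth_rate(phi, delta, gamma, mu=0.0, alpha=0.0):
    """R0 implied by growth rate phi in an SEIR model (assumed relation)."""
    return (1 + phi / (mu + delta)) * (1 + phi / (mu + gamma + alpha))


# Synthetic early-phase counts growing at 0.2 per day (hypothetical data).
days = list(range(10))
cases = [5.0 * math.exp(0.2 * t) for t in days]
phi = fit_growth_rate(days, cases)
r0 = r0_from_growth_rate(phi, delta=1 / 3, gamma=1 / 7)
```

Fitting on the logarithmic scale weights all early observations equally; a fit on the natural scale would instead be dominated by the latest, largest counts.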

It is possible, in principle, to fit the parameters of system (1) in order to retrieve the prevalence curve. The parameters can then be used to estimate the number and the prevalence of latent individuals (equations (5) and (6)). Alternatively, taking the average COVID-19 latency period as 3 days (i.e. *δ* _{H} = 1/3 per day), *E* _{H}(*t*) can be calculated by simply dividing equation (12) by *δ* _{H}, that is,

from which it is possible to estimate the prevalence of asymptomatic latents in the population:

The result for the case of COVID-19 in the province of Hubei is shown in Figure 3.

As an example, in a cohort of travellers departing from Hubei at week 15, the relative number of latent individuals carrying the COVID-19 virus is 0.75 per 1000 travellers, which is much higher than for Ebola, for instance [Reference Tuite7]. In other words, 1 out of every 1333 travellers from that region would be infected.
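The chain from reported incidence to latent prevalence can be illustrated with a short Python sketch. It assumes, as in the text, that the reported daily incidence equals *δ* _{H}*E* _{H}(*t*) with a latency period of 3 days. The incidence and population figures below are illustrative placeholders, not Hubei data, and the function names are hypothetical.

```python
def latent_individuals(new_cases_per_day, delta_h):
    """E_H(t): reported incidence delta_H * E_H(t) divided by delta_H."""
    return new_cases_per_day / delta_h


def latent_prevalence(new_cases_per_day, delta_h, population):
    """p_E(t): latent individuals as a proportion of the total population."""
    return latent_individuals(new_cases_per_day, delta_h) / population


def travellers_per_infected(prevalence):
    """Number of travellers among whom one latent infection is expected."""
    return 1.0 / prevalence


# Placeholder inputs (hypothetical): 2500 new cases per day, 10 million people.
example_prevalence = latent_prevalence(
    new_cases_per_day=2500.0, delta_h=1 / 3, population=10_000_000
)
per_thousand = 1000 * example_prevalence

# The prevalence quoted in the text, 0.75 per 1000, as one traveller in N.
one_in = travellers_per_infected(0.75 / 1000)
```

With the quoted prevalence, `one_in` evaluates to about 1333, which is the figure given above.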

### Illustrating model 2

Next, we assume that one individual traveller from an endemic country (index case) visits a disease-free country and remains infective for a period of (*μ* _{H} + *γ* _{H} + *α* _{H})^{−1} days.

To calculate the probability of a secondary autochthonous case generated by each infected traveller, we used the incidence curve described above to calculate the probability generating function according to equation (9) where the incidence is represented by the parameter *λ*(*t*).

Next, we calculate the values of the parameters *π* and *σ* defined for equations (9a) and (9b) to estimate the probability that the infected traveller who imported the virus to his/her home country would generate at least one secondary case, according to equation (11). The result is 23%, that is, a single infective traveller would generate at least one secondary case during his/her infectiousness period with a probability of 23%. Note that the expected number of secondary cases is the average value of the basic reproduction number, that is, 5.31.

## Discussion

In this paper we propose two complementary models for calculating the risk of international spreading of the novel coronavirus SARS-CoV-2 from the initial epicentre of COVID-19 in Wuhan, China. One model addresses the case of disease exportation from the epidemic outbreak and considers a certain number of travellers leaving the epidemic region during the incubation period, thereby importing the virus into another country. The model is deterministic and was illustrated with the data from the initial outbreak in the province of Hubei in China.

The first model's simulation resulted in an average *R* _{0} = 5.31 (ranging from 4.08 to 7.91) and a risk of spreading of 0.75 latent individuals per 1000 travellers. If we take the monthly number of travellers from the city of Wuhan to other Asian countries to be around 86 000, as described by Wu *et al*. [Reference Wu, Leung and Leung14], we should expect almost 65 infected travellers to be exported to these countries.
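The expected number of exportations follows from multiplying the traveller volume by the latent prevalence, under the simplifying assumption that the prevalence of 0.75 per 1000 holds for all travellers during the month:

$$86\,000 \times \frac{0.75}{1000} = 64.5 \approx 65$$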

The second model addresses the probability of disease introduction into a disease-free country by an index case from the epidemic epicentre. The model considers the situation in which a single traveller acquires the infection in an epidemic region and travels to a disease-free country, where he/she can trigger a local outbreak. As we consider a single traveller, we approach the case with a stochastic formulation. We simulated the model with the same case of the province of Hubei in China, and the results show that a single infective traveller would generate at least one secondary autochthonous case in the visited country during his/her infectiousness period with a probability of 23%. This probability should be contrasted with the average number of secondary cases the traveller would generate in his/her home country, 5.31. The latter is the average basic reproduction number of COVID-19 in the community of Hubei and should not be essentially different elsewhere when the population is immunologically naïve and contacts mix homogeneously. In a stochastic context, even when *R* _{0} is greater than 1, there is a probability of extinction of the infection. Moreover, the 23% risk of exportation is the probability that one traveller arriving in the infectious condition would generate at least one secondary autochthonous case of COVID-19.

Some important limitations of our approach are worth mentioning:

Our model assumes that only latent individuals travel. However, it is possible that some mildly symptomatic cases escape the screening measures at the time of travel. For example, a number of the earliest known exported cases travelled when sick, and there are reports of travellers taking antipyretics to mask their fever before boarding the plane. However, the number of patients who travel with mild symptoms is likely to be very small compared with the number of non-symptomatic latent individuals, and this should not interfere with our results. Furthermore, our results will depend on the incidence of COVID-19 in the departing country. For example, during the peak of the COVID-19 outbreak in Europe, about 3–6% of air passengers were SARS-CoV-2 positive on repatriation flights [Reference Lytras15].

From the modelling perspective, an important limitation is the homogeneous mixing assumption. We are well aware of the many heterogeneities involved in the transmission of a directly transmitted pathogen like SARS-CoV-2. In addition, the deterministic approach of the exportation model is an approximation of the real dynamics involved in transmission. However, neither limitation invalidates the qualitative results of the models. Considering the large number of people involved in the current epidemic, the deterministic approach and the homogeneous mixing assumption can be considered a good first approximation of the problem. Nevertheless, heterogeneities can be introduced in the model using the techniques described in [Reference Coutinho16], and they could significantly influence the quantitative results of our model. For instance, if variation in infectiousness were included, the risk of spread could be lower on average, and the speed of the infection spread could be affected as well.

Equation (1) assumes that only infected individuals, *I* _{H}, are infectious. In fact, at least a fraction *f* of the exposed individuals, *E* _{H}, may be infectious. So *I* _{H} in the first and the second equations of system (1) should be replaced with *I* _{H} + *fE* _{H}. Equation (A6) in Appendix A shows that *e* _{H}(0) and *i* _{H}(0) are related. This artificial feature can be removed by adding to equations (A1) and (A2) the initial infection terms $e_{\mathrm{H}}(0)\,\mathrm{e}^{-(\mu_{\mathrm{H}} + \delta_{\mathrm{H}})t}$ and $i_{\mathrm{H}}(0)\,\mathrm{e}^{-(\mu_{\mathrm{H}} + \alpha_{\mathrm{H}} + \gamma_{\mathrm{H}})t}$, respectively, and solving them [Reference Liu10].

Another limitation of our approach concerns the data with which we exemplify our application. We had access only to the global number of cases from the WHO website on COVID-19 and, in order to apply the model to the province of Hubei, we assumed that this region represents 84% of the global cases. Hence, we assumed a direct proportionality of the cases to simulate the model on these data. In addition, we simulated the model only until 11 February because China modified the diagnostic criteria during the course of the outbreak. The incidence curve, however, had already started to wane at that time. Moreover, we used the incidence data for the province of Hubei only to exemplify the models, which could be applied, in principle, to other situations related to the spread of pathogens from outbreak epicentres.

Finally, the model assumes that the potentially infective contact rate *β* in the receptive country of the index cases is the same as in the province of Hubei. In fact, *β* has a remarkable seasonality, being higher during winter and declining through spring into summer. Therefore, the probability of autochthonous cases is overestimated for receptive countries during the summer season.

We believe that the models presented here may represent a significant step forward in estimating the risk of importation of the novel coronavirus SARS-CoV-2.

## Financial support

This work was partially supported by the project ZikaPLAN, funded by the European Union's Horizon 2020 research and innovation programme under Grant Agreement No. 734584, by LIM01-HFMUSP, CNPq, FAPESP and Fundacao Butantan.

## Conflict of interest

None.

## Ethical standards

Not applicable (this is a purely theoretical work with no human subject involved).

## Appendix A

**Finding the basic reproduction number given by equation (2)**

Linearising the second and the third equations of system (1) around the no-disease state, that is, *S* _{H} = *N* _{H} − *s* _{H}, *E* _{H} = 0 + *e* _{H}, *I* _{H} = 0 + *i* _{H} and *R* _{H} = 0 + *r* _{H}, where *s* _{H}, *e* _{H}, *i* _{H} and *r* _{H} are small, we obtain

Assuming

we get

The above system has a non-trivial solution, with the unknowns *c* _{1} and *c* _{2} different from zero, only if the determinant of its coefficients vanishes. We then obtain

Equation (A5) has a positive solution when

is greater than 1. Thus, *R* _{0} is the basic reproduction number.
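For reference, the argument can be written out under the assumption that the linearised equations take the standard SEIR form with *S* _{H} ≈ *N* _{H}. The symbols *λ*, *c* _{1} and *c* _{2} follow the usage in this appendix, but the displays below are a reconstruction and may differ in detail from the numbered equations.

$$
\begin{aligned}
\frac{\mathrm{d}e_{\mathrm{H}}}{\mathrm{d}t} &= \beta\, i_{\mathrm{H}} - (\mu_{\mathrm{H}} + \delta_{\mathrm{H}})\, e_{\mathrm{H}}, \\
\frac{\mathrm{d}i_{\mathrm{H}}}{\mathrm{d}t} &= \delta_{\mathrm{H}}\, e_{\mathrm{H}} - (\mu_{\mathrm{H}} + \alpha_{\mathrm{H}} + \gamma_{\mathrm{H}})\, i_{\mathrm{H}}.
\end{aligned}
$$

Substituting $e_{\mathrm{H}} = c_1 \mathrm{e}^{\lambda t}$ and $i_{\mathrm{H}} = c_2 \mathrm{e}^{\lambda t}$ and requiring the determinant of the resulting linear system in $c_1$ and $c_2$ to vanish gives

$$
(\lambda + \mu_{\mathrm{H}} + \delta_{\mathrm{H}})(\lambda + \mu_{\mathrm{H}} + \alpha_{\mathrm{H}} + \gamma_{\mathrm{H}}) - \beta\,\delta_{\mathrm{H}} = 0 .
$$

Because the coefficient of $\lambda$ in this quadratic is positive, a positive root exists exactly when the constant term is negative, that is, when

$$
R_0 = \frac{\beta\,\delta_{\mathrm{H}}}{(\mu_{\mathrm{H}} + \delta_{\mathrm{H}})(\mu_{\mathrm{H}} + \alpha_{\mathrm{H}} + \gamma_{\mathrm{H}})} > 1 .
$$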

Also,

## Appendix B

Now we relate the basic reproduction number (equation (A5)) to the rate *φ* of new cases *N* _{c}(*t*) given by equation

The rate *φ* is obtained by fitting equation (B1) to the initial exponential growth shown in Figure 2.

Using equation (A4) we have that the number of new infected cases at *t* = 0 is given by

Since *e* _{H}(*t*) and *i* _{H}(*t*) are given by equations (A3) and (A4) and *λ* is the positive root of equation (A7), we have

and

But

from equation (A2). So,