
Linear and non-linear combination forecasting model of varicella incidence in Chongqing

Published online by Cambridge University Press:  02 August 2021

Hongfang Qiu
Affiliation:
Department of Epidemiology and Health Statistics, School of Public Health and Management, Chongqing Medical University, Chongqing 400016, China
Han Zhao
Affiliation:
Chongqing Municipal Center for Disease Control and Prevention, Chongqing 400042, China
Qi Chen
Affiliation:
Department of Epidemiology and Health Statistics, School of Public Health and Management, Chongqing Medical University, Chongqing 400016, China
Qiyin Wang
Affiliation:
Department of Epidemiology and Health Statistics, School of Public Health and Management, Chongqing Medical University, Chongqing 400016, China
Rong Ou
Affiliation:
Department of Medical Informatics Library, Chongqing Medical University, Chongqing 400016, China
Mengliang Ye*
Affiliation:
Department of Epidemiology and Health Statistics, School of Public Health and Management, Chongqing Medical University, Chongqing 400016, China
*Author for correspondence: Mengliang Ye, E-mail: yemengliang@cqmu.edu.cn

Abstract

Varicella is a highly contagious disease, and Chongqing is one of the high-incidence areas in China. Understanding the epidemic pattern of varicella and predicting its trend are important for risk analysis and health resource allocation in the health sector. First, we used the ‘STL’ function to decompose the varicella incidence series and characterise its trend and seasonality. Second, we fitted a SARIMA model to capture the linear component and then fitted an LS-SVM model to the SARIMA residuals to capture their non-linear component. Monthly varicella incidence peaked from April to June and from October to December. Compared with the SARIMA model alone, the hybrid model had a smaller prediction error, with RMSE and MAPE values of 0.7525 and 0.0647, respectively. The incidence of varicella in Chongqing is clearly seasonal, and the hybrid model predicts it well. Hybrid model analysis is therefore a feasible and simple method for predicting varicella incidence in Chongqing.

Type
Original Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
Copyright © The Author(s), 2021. Published by Cambridge University Press

Introduction

Varicella is a highly contagious disease caused by varicella-zoster virus [Reference Wang1]. Previous studies [Reference Brisson2, Reference Russell3] have shown that varicella is clearly seasonal, with one or two peaks per year, and that outbreaks in temperate regions often occur in winter and spring. In Spain, the incidence of varicella peaked from May to July and was low in October [Reference Perez-Farinos4]. Giammanco et al. showed that varicella was one of the common childhood diseases [Reference Giammanco5]. In China, Bao et al. [Reference Bao6], Cao et al. [Reference Cao7] and Bai et al. [Reference Bai8] described the epidemic situation of varicella in Wuhan, Wuxi and Shenyang, respectively. Their studies showed that the incidence of varicella is clearly seasonal and that most cases occur among students. According to the literature [Reference Dong9], a total of 3 047 715 cases of varicella, including 30 deaths, were reported in China from 2016 to 2019. The annual reported incidence and mortality rates were 5505/100 000 and 0.0005/100 000, respectively. In 2018, the incidence of varicella in Chongqing was 120.50/100 000, ranking second in China behind only Jiangsu Province. Chongqing is the largest city and economic centre in Southwest China. In 2018, the permanent resident population of Chongqing was about 31.02 million, and the proportion of children aged 0–14 was about 16.93%. This study therefore analyses the characteristics of the varicella epidemic and selects an appropriate model to forecast the incidence of varicella in Chongqing, so as to provide an epidemiological basis for future prevention and control.

Regarding prediction models of varicella, outside China, Soysal et al. conducted a time trend study on the incidence of varicella in Turkey [Reference Soysal10]. Giraldo et al. used an infectious disease dynamic model to conduct a preliminary study of varicella [Reference Deguen11–Reference Giraldo13]. Lee et al. discussed the incidence of varicella in South Korean children [Reference Lee14]. In China, most studies on varicella are descriptive [Reference De15–Reference Li18]. Some scholars used the infectious disease dynamics model to predict varicella in Changsha [Reference Pang19] and analysed the spatial aggregation of varicella in Jilin province [Reference Xiong20], while others used the ARIMA model [Reference Chen21] and the grey model [Reference Chen22] to predict the incidence of varicella. In general, the SARIMA model can only analyse linear information and cannot deal with non-linear information [Reference Wu23]. By contrast, the least squares support vector machine (LS-SVM), a variant of the support vector machine (SVM), is suitable for small samples and handles non-linear information well [Reference Alwee24].

Considering the advantages and disadvantages of the prediction methods and the amount of research data, a single prediction model and a combined prediction model were established, respectively, based on the varicella data, and the seasonality of varicella was analysed. By comparing the prediction errors of different models, the best prediction model was selected. The best prediction model was used for short-term prediction to provide reference information for the prevention and intervention of varicella in Chongqing.

Materials and methods

Materials

This paper studied the monthly incidence of varicella in Chongqing from January 2014 to December 2018. The monthly incidence data were obtained mainly from the Chongqing CDC.

Methods

SARIMA model

Compared with the ARIMA model, the SARIMA model adds a seasonal component; the modelling process is otherwise similar to that of the ARIMA model. The SARIMA expression is [Reference Qiu25]

$$\eqalign{\nabla ^d\nabla _S^D x_t = & \displaystyle{{\Theta ( B) \Theta _S( B) } \over {\Phi ( B) \Phi _S( B) }}\varepsilon _t \cr \Theta ( B) = & 1-\theta _1B-\cdots -\theta _qB^q \cr \Phi ( B) = & 1-\phi _1B-\cdots -\phi _pB^p \cr \Theta _S( B) = & 1-\theta _1B^S-\cdots -\theta _QB^{QS} \cr \Phi _S( B) = & 1-\phi _1B^S-\cdots -\phi _PB^{PS}} $$

Here $B$ is the backward shift operator, $\varepsilon _t$ is the estimated residual at time $t$ with zero mean and constant variance, $x_t$ denotes the observed value at time $t$ ($t = 1, 2, \ldots, k$), and $S$ is the length of the seasonal period. The orders $p$, $P$, $d$, $D$, $q$ and $Q$ are the autoregressive order, seasonal autoregressive order, number of differences, number of seasonal differences, moving average order and seasonal moving average order, respectively.
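The differencing operator on the left-hand side of the SARIMA expression can be illustrated with a short Python sketch. It is a minimal illustration only: the function names and the toy series are assumptions made for this example and are not part of the study's data or software.

```python
def difference(series, lag=1):
    """Return the lag-differenced series x_t - x_{t-lag}."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def seasonal_and_regular_difference(series, d=1, D=1, s=12):
    """Apply D seasonal differences of period s, then d regular differences."""
    out = list(series)
    for _ in range(D):
        out = difference(out, lag=s)
    for _ in range(d):
        out = difference(out, lag=1)
    return out


# Toy monthly series (illustrative only): linear trend plus a fixed 12-month pattern.
pattern = [0, 1, 3, 5, 6, 5, 2, 1, 2, 4, 6, 4]
toy = [0.5 * t + pattern[t % 12] for t in range(60)]

# With d = D = 1 and s = 12, both the trend and the seasonal pattern are removed.
stationary = seasonal_and_regular_difference(toy, d=1, D=1, s=12)
```

For this toy series the seasonal difference leaves a constant and the regular difference then removes it, which is the behaviour the operator $\nabla ^d\nabla _S^D$ is meant to achieve on a series with a trend and a stable seasonal pattern.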

SARIMA model modelling steps

First, the stationarity of the sequence was assessed, and a non-stationary sequence was made stationary by appropriate methods such as differencing. Second, the four main parameter values of the model (p, q, P, Q) were determined from the tailing and truncation of the autocorrelation and partial autocorrelation coefficients. Third, residual and parameter tests were carried out for each candidate model. The AIC and BIC values of the models that passed were then compared, and the model with the smallest values of both indices was chosen as the optimal model. Finally, the optimal model was used for prediction.
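The selection step can be sketched in Python as follows. This is a hedged illustration, not the procedure used in the study: the log-likelihoods, parameter counts and effective sample size below are hypothetical placeholders, and ranking by AIC with BIC as a tie-breaker is a simplification of comparing both indices side by side.

```python
import math
from itertools import product


def aic(log_likelihood, k):
    """Akaike information criterion for a model with k estimated parameters."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * log_likelihood


# Candidate orders with p, q, P, Q each in {0, 1, 2}, fixing d = D = 1 and S = 12.
candidates = [
    ((p, 1, q), (P, 1, Q, 12)) for p, q, P, Q in product(range(3), repeat=4)
]


def select_best(fits, n):
    """Pick the order with the lowest AIC, using BIC to break ties.

    `fits` maps an order tuple to (log_likelihood, number_of_parameters).
    """
    return min(fits, key=lambda order: (aic(*fits[order]), bic(*fits[order], n)))


# Hypothetical fit summaries for two models (values invented for illustration).
example_fits = {
    ((1, 1, 1), (1, 1, 0, 12)): (-20.0, 4),
    ((2, 1, 1), (1, 1, 1, 12)): (-15.0, 6),
}
best = select_best(example_fits, n=41)  # n = 41 is an assumed effective sample size
```

In practice the log-likelihood of each candidate would come from the fitting software, and only models that pass the residual and parameter tests would be entered into the comparison.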

Hybrid model

The difference between the actual value $y_i$ and the fitted value $\hat{y}_i$ of the optimal SARIMA model constitutes the residual sequence $e_i = y_i - \hat{y}_i$. The residual sequence was normalised [Reference Zhu26] and then used as the sample for fitting the LS-SVM model. Assume a training set $(x_i, y_i)$ of $l$ data points, with $x_i \in \mathbb{R}^n$, $y_i \in \mathbb{R}$ and $i = 1, 2, \cdots, l$, where $x_i$ is the input data and $y_i$ is the output data. The objective optimisation function of the LS-SVM algorithm is:

$$\eqalign{\min J( \omega , \;e) = & \displaystyle{1 \over 2}\omega ^T\omega + \displaystyle{1 \over 2}\gamma \sum\limits_{i = 1}^l {e_i^2 } \cr {\rm s.t.}\quad y_i = & \omega ^T\phi ( x_i) + b + e_i, \;i = 1, \;2, \;\cdots , \;l} $$

In the formula, $\phi(\cdot): \mathbb{R}^n \to \mathbb{R}^{n_h}$ is the kernel space mapping function, $\omega$ is the weight vector, $b$ is the bias term, $e_i$ is the error variable and $\gamma$ is the adjustment parameter factor.
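As a rough sketch of how this objective is usually solved, the Python code below forms the standard LS-SVM linear system that follows from the optimality conditions of the problem above and solves it with NumPy. The function names, the toy data and the hyperparameter values are assumptions for illustration; in particular, the RBF kernel is parametrised here as $\exp(-\Vert a-b\Vert^2/(2\sigma^2))$, which may differ from the convention of the software used in the study.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """Gaussian RBF kernel matrix, K[i, j] = exp(-||a_i - b_j||^2 / (2 sigma^2))."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma**2))


def lssvm_fit(X, y, gamma, sigma):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y] for b and alpha."""
    n = len(y)
    K = rbf_kernel(X, X, sigma)
    system = np.zeros((n + 1, n + 1))
    system[0, 1:] = 1.0
    system[1:, 0] = 1.0
    system[1:, 1:] = K + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(system, rhs)
    return solution[0], solution[1:]  # bias b, multipliers alpha


def lssvm_predict(X_train, b, alpha, X_new, sigma):
    """Evaluate f(x) = sum_i alpha_i K(x, x_i) + b at the rows of X_new."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b


# Toy regression problem (illustrative only, not the varicella residuals).
X = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
y = np.sin(2.0 * np.pi * X[:, 0])
b, alpha = lssvm_fit(X, y, gamma=100.0, sigma=0.2)
fitted = lssvm_predict(X, b, alpha, X, sigma=0.2)
```

Because the inequality constraints of the standard SVM are replaced by equalities, training reduces to one linear solve, and each training error equals the corresponding multiplier divided by $\gamma$.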

Sample data normalisation formula:

$$x^\ast = \displaystyle{{x_i-x_{\min }} \over {x_{\max }-x_{\min }}}$$

Anti-normalisation formula:

$$x{\prime} = ( x_{\max }-x_{\min }) \hat{x} + x_{\min }$$

where $x_i$ is the sample data, $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of the sample data, respectively, $x^\ast$ is the normalised data, $\hat{x}$ is the predicted value on the normalised scale and $x{\prime}$ is the anti-normalised value.
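The two formulas can be checked with a small Python sketch. The function names and the sample values are assumptions for illustration only.

```python
def min_max_normalise(values):
    """Scale values to [0, 1]; also return the min and max needed to invert."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalise a constant sequence")
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled, lo, hi


def inverse_normalise(scaled, lo, hi):
    """Map normalised values back to the original scale."""
    return [(hi - lo) * s + lo for s in scaled]


# Illustrative residual values (not taken from the study).
residuals = [-0.4, 0.1, 0.6, -0.1]
scaled, lo, hi = min_max_normalise(residuals)
restored = inverse_normalise(scaled, lo, hi)
```

The minimum and maximum are taken from the training samples and reused when predictions are mapped back, so that forecasts are returned on the same scale as the original residuals.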

The root mean square error (RMSE) and mean absolute percentage error (MAPE) were used to compare the fitting effect. The RMSE and MAPE calculation formulas are [Reference Qiu25]:

$${\rm RMSE} = \sqrt {\displaystyle{1 \over n}\sum\limits_{t = 1}^n {{( x_t-\hat{x}_t ) }^2} } $$
$${\rm MAPE} = \displaystyle{1 \over n}\sum\limits_{t = 1}^n {\displaystyle{{\vert x_t-\hat{x}_t \vert } \over {x_t}}} $$

In the above equations, $x_t$ is the actual incidence value, $\hat{x}_t$ is the estimated incidence value and $n$ is the number of months forecast. The lower the RMSE and MAPE values, the better the fit.
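A minimal Python sketch of the two error measures follows. The function names and the example numbers are assumptions for illustration, and the MAPE function assumes strictly positive actual values, as is the case for incidence rates.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction (not multiplied by 100)."""
    n = len(actual)
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / n


# Illustrative values only (not the study's data).
actual = [10.0, 20.0, 40.0]
predicted = [11.0, 18.0, 44.0]
example_rmse = rmse(actual, predicted)  # sqrt((1 + 4 + 16) / 3) = sqrt(7)
example_mape = mape(actual, predicted)  # each relative error is 0.1
```

RMSE is expressed in the units of the incidence rate and penalises large errors more heavily, whereas MAPE is scale-free, which is why the two are reported together.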

Results

Descriptive analyses

Table 1 shows that 112 273 varicella cases were reported in Chongqing over the 5 years from 2014 to 2018, including 58 897 males and 53 376 females, a male-to-female ratio of 1.1034:1. Most cases occurred in children aged 0–9 years (n = 63 275), who accounted for 56.36% of all reported cases. Students made up the largest share of cases, 60.74% (n = 68 200), followed by children in kindergarten and scattered children.

Table 1. Distribution of varicella by sex, age and occupation in Chongqing from 2014 to 2018

SARIMA model construction

This study used the ‘STL’ function to decompose the sequence. Figure 1 shows that the sequence is clearly seasonal and that the incidence rate trends upward over time. Table 2 shows that the incidence of varicella in Chongqing peaked from April to June and from October to December, when the seasonal index was >1. The time series diagram (Fig. 2) shows that the monthly incidence of varicella was non-stationary. After differencing the original sequence, the data appeared stationary (Fig. 3), and the unit root test confirmed that the differenced sequence was stationary (P < 0.05). In the autocorrelation and partial autocorrelation graphs of the sequence (Fig. 4), both the autocorrelation coefficient and the partial autocorrelation coefficient tailed off. Assuming that p, q, P and Q do not exceed 2, we tested each of the four parameters from 0 to 2. Only six models passed the residual test and parameter test: SARIMA(1, 1, 1) × (1, 1, 0)12, SARIMA(2, 1, 2) × (1, 1, 1)12, SARIMA(1, 1, 1) × (1, 1, 1)12, SARIMA(2, 1, 1) × (1, 1, 1)12, SARIMA(2, 1, 2) × (1, 1, 0)12 and SARIMA(1, 1, 1) × (0, 1, 1)12. Comparing the AIC and BIC values and the two error indicators of the six models in Table 3, we selected SARIMA(2, 1, 1) × (1, 1, 1)12 as the best model.

Fig. 1. Trend, seasonal and residual components derived from ‘STL’ decomposition of monthly varicella incidence for Chongqing during 2014–2018.

Fig. 2. Reported monthly incidence of varicella from January 2014 to June 2018.

Fig. 3. Sequence diagram after a one-step difference and seasonal difference with a period of 12.

Fig. 4. Autocorrelation function (ACF) and partial autocorrelation function (PACF) charts of monthly varicella incidence data. (a) ACF; (b) PACF.

Table 2. The seasonal index after the decomposition of ‘STL’ function

Table 3. AIC, BIC values, RMSE and MAPE for different SARIMA models

The incidence data from January 2014 to June 2018 formed the training set (54 observations), and the data from July 2018 to December 2018 formed the test set (6 observations).

Table 4 shows the estimates, standard errors and significance values of the model parameters; all parameter tests were statistically significant. In addition, the P values of the Ljung–Box (LB) statistics at lags 6 and 12 were 0.9091 and 0.6901, respectively. The white noise test was therefore not significant, indicating that the residuals were white noise and that the fitted SARIMA(2, 1, 1) × (1, 1, 1)12 model was adequate. The model equation is given as

$$\eqalign{\nabla \nabla _{12}x_t & = \displaystyle{{( 1-0.5510B) ( 1-0.9997B^{12}) } \over {( 1-0.5554B + 0.3933B^2) ( 1 + 0.4038B^{12}) }}\varepsilon _t, \;\varepsilon _t \sim N( 0, \;0.01676)} $$

Table 4. Estimates and standard error of SARIMA(2, 1, 1) × (1, 1, 1)12 model parameters

The SARIMA(2, 1, 1) × (1, 1, 1)12 model was used to forecast the incidence of varicella. Table 5 shows the predicted values; the RMSE and MAPE values were 0.7843 and 0.0654, respectively. The actual and fitted monthly incidence from the SARIMA model are shown in Figure 5. As Figure 5 and Table 5 show, the predicted trend and epidemic pattern are very close to the actual incidence and epidemic pattern of varicella.

Fig. 5. Graph of fitted and predicted values of SARIMA(2, 1, 1) × (1, 1, 1)12 model.

Table 5. Prediction of varicella incidence by two models

Hybrid model construction

First, we took the residual sequence of the SARIMA(2, 1, 1) × (1, 1, 1)12 model from January 2014 to June 2018 as the training set and the residuals from July 2018 to December 2018 as the test set, and normalised the training samples. We then chose the RBF kernel for the LS-SVM model, tried different values of the embedding dimension m and the time delay τ, compared the prediction errors, and found that the prediction error was smallest when m was 3 and τ was 12. That is, the incidence of the same period in the first 3 years was used to predict the incidence of the same period in the fourth year; after 50 iterations, the parameter values tended to be stable. Next, sample reconstruction was performed, and the optimal parameters γ and σ were solved by a genetic algorithm, giving values of 8.8540 and 110.8799, respectively, which established the optimal combination model. Finally, the residual was predicted and inverse normalisation was applied to obtain the predicted residual value (Table 6); the monthly incidence of varicella predicted by the combination model was $y_i^\ast = \hat{y}_i + \hat{e}_i$ (Table 5).
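The sample reconstruction with m = 3 and τ = 12, and the final combination step, can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the integer series merely stands in for the 54 monthly training residuals, and the numbers passed to `combine` are invented.

```python
def delay_embed(series, m=3, tau=12):
    """Pair each target e_t with the inputs [e_{t - m*tau}, ..., e_{t - tau}]."""
    start = m * tau
    inputs = [
        [series[t - k * tau] for k in range(m, 0, -1)]
        for t in range(start, len(series))
    ]
    targets = [series[t] for t in range(start, len(series))]
    return inputs, targets


def combine(sarima_forecast, residual_forecast):
    """Hybrid forecast: SARIMA prediction plus the predicted residual."""
    return [y_hat + e_hat for y_hat, e_hat in zip(sarima_forecast, residual_forecast)]


# Stand-in for a 54-month training sequence (indices used as values for clarity).
series = list(range(54))
inputs, targets = delay_embed(series, m=3, tau=12)

# Invented numbers showing the combination step only.
hybrid = combine([10.0, 12.0], [0.5, -0.25])
```

With 54 training months, the first 36 values serve only as inputs, which leaves 18 input-target pairs; each input vector holds the values from the same calendar month in the three preceding years.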

Table 6. Residual values predicted by the LS-SVM model

Model comparison

First, we compared the fitting effects of the two models. Figure 6 shows that the fitted value of the mixed model lies between the actual value and the fitted value of the single model. Second, we compared the prediction effects of the two models. Table 5 and Figure 7 show that the mixed model has slightly smaller RMSE and MAPE values and that its predicted value is closer to the actual value. The mixed model is therefore the best prediction model.

Fig. 6. Fitting values of the two models.

Fig. 7. Predicted values of two models.

Discussion

The descriptive analysis shows that varicella cases are distributed approximately equally between males and females and that the incidence is highest among students, children in kindergarten and scattered children, so prevention and control efforts targeted at these groups could effectively reduce the incidence of varicella. Decomposing the sequence with the ‘STL’ function not only shows the trend and seasonal changes of the varicella incidence sequence but also yields the seasonal index of each month, which makes the seasonality easy to see. We conclude that the incidence of varicella in Chongqing peaked from April to June and from October to December, and that February to March and August to September were two low periods, which is consistent with relevant studies [Reference Wang27–Reference Yang29]. The trough periods may be related to the students' winter and summer vacations, during which children's opportunities for exposure are significantly reduced. It is therefore necessary to strengthen intervention measures to avoid infection during the periods of high varicella incidence.

The SARIMA model is suitable for series in which seasonal effects, long-term trends and random fluctuations interact in a complex way. It is one of the time series models commonly used to predict infectious diseases such as tuberculosis [Reference Mao30], hand-foot-mouth disease [Reference Tian31], conjunctivitis [Reference Liu32], mumps [Reference Xu33] and influenza [Reference Song34]. We used the SARIMA model to perform linear fitting on the varicella series. Based on the AIC and BIC values combined with the RMSE and MAPE values, SARIMA(2, 1, 1) × (1, 1, 1)12 was the best model, with RMSE and MAPE values of 0.7843 and 0.0654, respectively. The fitting diagram (Fig. 5) shows a very good match between the observed values and the fitted values, the 95% CI of the forecast values contains all of the observed data, and the SARIMA(2, 1, 1) × (1, 1, 1)12 model extracts the deterministic information in the sequence well. Because infectious diseases are affected by external factors and by internal factors of the human body, and show irregular changes and non-linear dynamic characteristics, the combined SARIMA and LS-SVM model joins linear analysis with non-linear analysis.

The SVM has great potential and good performance in practical applications [Reference De Giorgi35–Reference Zhang37]. LS-SVM uses the squared error as the loss function and replaces the inequality constraints with equality constraints, which simplifies the SVM algorithm and reduces its complexity while retaining the advantages of the standard SVM. SVM has gradually been introduced into the field of infectious diseases, such as bacillary dysentery [Reference Xie38], hepatitis B [Reference Qiu25] and hand-foot-mouth disease [Reference Zou39]. In this study, we chose the RBF kernel for the LS-SVM model; compared with other kernel functions, its parameters are easier to choose, its space complexity changes little and it is easy to implement. As Table 5 shows, the predicted RMSE and MAPE values of the SARIMA model are 0.7843 and 0.0654, while those of the mixed model are 0.7525 and 0.0647. Compared with the single SARIMA model, the mixed model has the advantage of treating the non-linear part of the residual error. In addition, Figure 6 shows that both the single model and the mixed model reflect the trend and peaks of the actual varicella incidence well. However, the fitted and predicted values of the mixed model lie between the actual values and those of the single model (Figs 6 and 7), indicating that the prediction effect of the mixed model is better. The mixed model can not only describe the periodicity and seasonal variation of varicella incidence in Chongqing but also fit the non-linear part well.

In conclusion, although the prediction effect of the model is relatively good, prevention and control work for the high incidence of varicella should begin as early as possible, including strengthened daily disinfection in public places, large-scale vaccination and other prevention and control measures. To improve the accuracy of the prediction model, the data should be updated continually in future analyses, so that the model can be optimised continuously and reflect the pattern and development trend of the data.

Conclusions

Based on the results of this study, applying the hybrid model to forecast the incidence of varicella is feasible. The fitted and predicted values of the mixed model follow the same trend as the actual varicella incidence, and the curves are relatively close. This suggests that a hybrid model can be used to predict the incidence of varicella. Short-term prediction of varicella is effective and helps in evaluating prevention and control measures. It also allows timely and effective countermeasures to be adopted for epidemic peaks that may occur.

Acknowledgements

The authors express their thanks to the Chongqing Municipal Center for Disease Control and Prevention for the disease data as well as the help from teachers of Chongqing Medical University.

Author contributions

H.Q. and H.Z. contributed equally to this paper. Conceptualisation, M.Y.; Methodology, H.Q. and H.Z.; Software, H.Q.; Validation, R.O.; Formal analysis, H.Q.; Investigation, H.Q. and H.Z.; Resources, H.Z.; Data curation, Q.C. and Q.W.; Writing – original draft preparation, H.Q.; Writing – review and editing, H.Q., R.O. and M.Y.; Visualisation, Q.C. and Q.W.; Supervision, M.Y. and R.O.; Project administration, M.Y. All authors read and approved the final manuscript.

Financial support

Application and Research of Public Health Emergency Management and On-site Emergency Management System (2020MSXM018).

Conflict of interest

None.

Consent for publication

Not applicable.

Data availability statement

The varicella incidence data were obtained from the Chongqing Municipal Center for Disease Control and Prevention; the data are confidential and cannot be shared. Incidence is defined as the number of new cases of a disease in a population during a period divided by the number of people exposed during the same period.

References

1. Wang, C-L et al. (2016) Investigation and treatment of chickenpox outbreak in a primary school. Journal of Applied Preventive Medicine 22, 350–352.
2. Brisson, M et al. (2001) Epidemiology of varicella zoster virus infection in Canada and the United Kingdom. Epidemiology and Infection 127, 305–314.
3. Russell, ML et al. (2005) The changing epidemiology of chickenpox in Alberta. Vaccine 23, 5398–5403.
4. Perez-Farinos, N et al. (2007) Varicella and herpes zoster in Madrid, based on the sentinel general practitioner network: 1997–2004. BMC Infectious Diseases 7, 59.
5. Giammanco, G et al. (2009) Universal varicella vaccination in the Sicilian paediatric population: rapid uptake of the vaccination programme and morbidity trends over five years. Euro Surveillance 14, 19321.
6. Bao, W-B et al. (2019) Analysis on the epidemic characteristics of varicella in a district of Wuhan in 2011 and 2018. Public Health and Preventive Medicine 30, 67–70.
7. Cao, X-P et al. (2019) Epidemiological analysis of chicken pox public health emergency in Liangxi District, Wuxi from 2016 to 2018. Jiangsu Journal of Preventive Medicine 30, 664–665.
8. Bai, S (2020) Epidemic characteristics of varicella among primary school students in Shenyang in 2006 and 2018. Chinese Journal of School Health 41, 148–150.
9. Dong, P-M et al. (2020) Epidemiological characteristics of varicella in China, 2016–2019. Chinese Vaccines and Immunization 26, 403–406.
10. Soysal, A et al. (2021) Incidence of varicella and herpes zoster after inclusion of varicella vaccine in national immunization schedule in Turkey: time trend study. Human Vaccines and Immunotherapeutics 17, 731–737.
11. Deguen, S et al. (2000) Estimation of the contact rate in a seasonal SEIR model. Application to chickenpox incidence in France. Statistics in Medicine 19, 1207–1216.
12. Costantino, V et al. (2017) Projections of zoster incidence in Australia based on demographic and transmission models of varicella-zoster virus infection. Vaccine 35, 6737–6742.
13. Giraldo, JO et al. (2008) Deterministic SIR (susceptible-infected-removed) models applied to varicella outbreaks. Epidemiology and Infection 136, 679–687.
14. Lee, YH et al. (2019) Increasing varicella incidence rates among children in the Republic of Korea: an age-period-cohort analysis. Epidemiology and Infection 147, e245.
15. De, SK et al. (2015) Herpes simplex virus and varicella zoster virus: recent advances in therapy. Current Opinion Infectious Diseases 28, 589–595.
16. Lin, F et al. (2000) Epidemiology of primary varicella and herpes zoster hospitalizations: the pre-varicella vaccine era. Journal of Infectious Diseases 181, 1897–1905.
17. Walker, J-L et al. (2017) Trends in the burden of varicella in UK general practice. Epidemiology and Infection 145, 2678–2682.
18. Li, X-F (2018) Analysis on the Epidemiologic Characteristics of Varicella in Qingdao from 2007 to 2016. Shandong: Qingdao University.
19. Pang, F-R (2019) Application of Infectious Disease Dynamics Model in Early Warning of Varicella Epidemic Situation and Evaluation of Intervention Measures in Changsha City. Hunan: Hunan Normal University.
20. Xiong, S-H (2020) Research on Epidemiological Characteristics and Spatial Aggregation of Chickenpox in Jilin Province during 2009–2018. Jilin: Jilin University.
21. Chen, J et al. (2019) Epidemiological characteristics and epidemic prediction of varicella in Xicheng District of Beijing (2013–2018). Public Health and Preventive Medicine 30, 118–121.
22. Chen, X-N et al. (2014) Application of grey model for prediction of chicken pox incidence in Baiyun District of Guangzhou. China Tropical Medicine 14, 746–748.
23. Wu, W et al. (2015) Comparison of two hybrid models for forecasting the incidence of hemorrhagic fever with renal syndrome in Jiangsu Province, China. PLoS ONE 10, e0135492.
24. Alwee, R et al. (2013) Hybrid support vector regression and autoregressive integrated moving average models improved by particle swarm optimization for property crime rates forecasting with economic indicators. The Scientific World Journal 2013, 951475.
25. Qiu, H-F et al. (2020) Forecasting the incidence of acute haemorrhagic conjunctivitis in Chongqing: a time series analysis. Epidemiology and Infection 148, e193.
26. Zhu, J et al. (2012) Study on a forecasting model for infectious disease incidence rate based on least squares support vector machine. Occupation and Health 28, 2662–2664.
27. Wang, J et al. (2017) Analysis on public health emergencies at schools in Wanzhou District of Chongqing from 2011–2015. Occupation and Health 33, 2435–2437.
28. Deng, W-W et al. (2017) Epidemiological characteristics of varicella in Nan'an district of Chongqing, 2012–2016. Modern Preventive Medicine 44, 3470–3474.
29. Yang, X-J et al. (2020) Epidemiological characteristics of varicella in Qijiang District of Chongqing from 2015 to 2018. Parasitoses and Infectious Diseases 18, 89–96.
30. Mao, Q et al. (2018) Forecasting the incidence of tuberculosis in China using the seasonal auto-regressive integrated moving average (SARIMA) model. Journal of Infection and Public Health 11, 707–712.
31. Tian, C-W et al. (2019) Time-series modelling and forecasting of hand, foot and mouth disease cases in China from 2008 to 2018. Epidemiology and Infection 147, e82.
32. Liu, H et al. (2020) Forecast of the trend in incidence of acute hemorrhagic conjunctivitis in China from 2011–2019 using the Seasonal Autoregressive Integrated Moving Average (SARIMA) and Exponential Smoothing (ETS) models. Journal of Infection and Public Health 13, 287–294.
33. Xu, Q-Q et al. (2017) Forecasting the incidence of mumps in Zibo city based on a SARIMA model. International Journal of Environmental Research and Public Health 14, 925.
34. Song, X et al. (2016) Time series analysis of influenza incidence in Chinese provinces from 2004 to 2011. Medicine (Baltimore) 95, e3929.
35. De Giorgi, M et al. (2014) Comparison between wind power prediction models based on wavelet decomposition with Least-Squares Support Vector Machine (LS-SVM) and Artificial Neural Network (ANN). Energies 7, 5251–5272.
36. Liu, B-C et al. (2017) Urban air quality forecasting based on multi-dimensional collaborative Support Vector Regression (SVR): a case study of Beijing-Tianjin-Shijiazhuang. PLoS ONE 12, e0179763.
37. Zhang, X-Y et al. (2014) Applications and comparisons of four time series models in epidemiological surveillance data. PLoS ONE 9, e88075.
38. Xie, H-C et al. (2013) Application of a support vector machine on the prediction of the incidences of infectious diseases. Modern Preventive Medicine 40, 4105–4112.
39. Zou, J-J et al. (2019) Application of a combined model with seasonal autoregressive integrated moving average and support vector regression in forecasting hand-foot-mouth disease incidence in Wuhan, China. Medicine (Baltimore) 98, e14195.