Reportable disease information is a key source for determining the epidemiology of infectious gastrointestinal illness (IGI) in developed countries. However, such data capture information only on diseases deemed reportable, and only on a fraction of those cases because of reporting biases [1, 2]. Not every individual with IGI will seek medical care, not everyone who seeks medical care will provide a stool sample, not all samples will test positive, and not all positive samples will be reported. Internationally, several studies have estimated under-reporting either of IGI overall or of specific pathogens [2–5]. The current study brings two important additions to traditional methodology. First, under-reporting has typically been calculated using single-value estimates of the proportion reported at each step, which are then multiplied together and the inverse taken. However, there is considerable uncertainty (i.e. our lack of knowledge about the true values of these proportions) and variability (i.e. inherent randomness in these values) that are not captured by single-value estimates, but which may instead be represented by probability distributions. It is possible to use simulation methods to link these distributions together to generate an output distribution of the estimated number of IGI cases in the community for each officially reported case. This output distribution includes the combined uncertainty and variability of the input distributions. Second, it is important to understand what biases are systematically built into surveillance data. Ultimately, are there certain characteristics that make someone more or less likely to be counted in reportable disease statistics? While previous studies have assessed factors influencing physician consultation among people with IGI, no studies have assessed both this and the factors affecting physician stool requests within the same population.
Thus, the objective of this study was to determine the under-reporting of IGI in the province of British Columbia (BC), Canada, accounting for uncertainty and variability in the data, and to evaluate the characteristics of individuals who move up the reporting chain towards eventual capture in provincial communicable disease statistics.
In order to assess the prevalence and under-reporting of IGI, multiple studies were undertaken by the Public Health Agency of Canada (formerly Health Canada) as part of the National Studies on Acute Gastrointestinal Illness (NSAGI). These studies included: (1) a population survey of IGI in the preceding 28 days, (2) a mail survey of laboratory practices related to processing, diagnostics and reporting, and (3) an assessment of local public health reporting practices.
Detailed methodologies for NSAGI surveys have been reported elsewhere [6–8]. In brief, the population survey consisted of a retrospective, cross-sectional telephone survey administered from June 2002 to June 2003 to randomly selected residents from three public health regions in the province of BC, Canada: one urban region (Vancouver, population 550 000), one rural region (East Kootenay, population 80 000), and one semi-urban and rural region (Northern Interior; population 132 000). The combined population of these regions represented around 19% of the total provincial population; a 44% response rate was achieved. The laboratory survey was mailed to 105 participating private and hospital-based laboratories in BC that test stool specimens for enteric bacteria, parasites or viruses; 89% responded. A mailed survey, which achieved a 92% response rate, was sent to all 19 BC Health Service Delivery areas in existence in 2002.
Cases of acute IGI were defined as individuals who reported any vomiting or diarrhoea of presumed infectious origin in the 28 days prior to the telephone interview. Diarrhoea was defined as any loose stool or stool with abnormal liquidity. If more than one episode of illness was reported, the analysis related only to the most recent episode of IGI. The non-case category included respondents who self-identified with non-infectious IGI due to a medical condition, food allergy, pregnancy or as the result of medication use. A person seeking health care was any ill person who saw a doctor, nurse or nurse practitioner for their illness.
Calculation of under-reporting
Under-reporting of IGI in BC was characterized by estimating the proportion of cases that moved up through each of eight sequential tiers of reporting, conditional on reaching the previous tier (Fig. 1). In order to account for uncertainty around the proportional estimates for each tier, an input distribution for each proportion in the reporting chain was specified. The distributions for each tier were then multiplied together, similar to a series of conditional probabilities, using simulation modelling with 10 000 iterations and Latin Hypercube sampling in @RISK 4.5.2 (Palisade Corporation, New York, NY, USA) as an add-in for Microsoft Excel 2000 (Microsoft Corporation, Redmond, WA, USA). Input distributions (type of distribution and parameters) for the proportion of cases reported at each step of the reporting chain were determined using data from the NSAGI studies described above. Specific input distributions used for each under-reporting step are summarized in Table 1. A sensitivity analysis was conducted to determine which input distribution had the most influence on the generated overall under-reporting distribution. This was done by calculating and ranking correlation coefficients between each of the input distributions and the distribution for overall under-reporting.
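The chained simulation and rank-correlation sensitivity analysis described above can be sketched as follows. This is a minimal illustration only: it uses plain Python in place of @RISK, a hand-rolled Latin Hypercube sampler, and uniform inputs for every tier. All ranges in `EXAMPLE_RANGES` are hypothetical placeholders, not the Table 1 distributions, and the function names are invented for this sketch.

```python
import random

def latin_hypercube(n, rng):
    """n stratified draws from U(0,1): one per equal-width bin, in random order."""
    draws = [(i + rng.random()) / n for i in range(n)]
    rng.shuffle(draws)
    return draws

def ranks(values):
    """Rank of each value (0 = smallest); continuous draws make ties unlikely."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0] * len(values)
    for rank, idx in enumerate(order):
        out[idx] = rank
    return out

def pearson(x, y):
    """Pearson correlation; applied to ranks it gives Spearman's coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def simulate(ranges, n=10_000, seed=1):
    """Multiply the conditional proportions per iteration and invert the product."""
    rng = random.Random(seed)
    inputs = {
        name: [lo + u * (hi - lo) for u in latin_hypercube(n, rng)]
        for name, (lo, hi) in ranges.items()
    }
    multipliers = []
    for i in range(n):
        product = 1.0
        for draws in inputs.values():
            product *= draws[i]
        multipliers.append(1.0 / product)  # community cases per reported case
    out_ranks = ranks(multipliers)
    sensitivity = {k: pearson(ranks(v), out_ranks) for k, v in inputs.items()}
    return multipliers, sensitivity

def percentile(sorted_vals, q):
    """Simple index-based percentile of an already sorted list."""
    return sorted_vals[min(len(sorted_vals) - 1, int(q * len(sorted_vals)))]

# HYPOTHETICAL (min, max) ranges for the seven tier-to-tier proportions.
EXAMPLE_RANGES = {
    "sought_care": (0.08, 0.14),
    "stool_requested": (0.13, 0.47),
    "stool_submitted": (0.70, 0.95),
    "sample_tested": (0.85, 1.00),
    "tested_positive": (0.06, 0.12),
    "reported_local": (0.80, 1.00),
    "reported_province": (0.80, 1.00),
}

multipliers, sensitivity = simulate(EXAMPLE_RANGES)
ordered = sorted(multipliers)
summary = {
    "p5": percentile(ordered, 0.05),
    "median": percentile(ordered, 0.50),
    "p95": percentile(ordered, 0.95),
    "mean": sum(ordered) / len(ordered),
}
```

Each correlation in `sensitivity` is negative because a larger proportion advancing at any tier lowers the multiplier; the input with the most negative value contributes most to the spread of the output distribution.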
Fig. 1. Under-reporting pyramid for gastrointestinal illness in British Columbia. Overall under-reporting was characterized by estimating the proportion of cases that moved up through each of eight sequential tiers of reporting, conditional on reaching the previous tier.
Table 1. Input distributions and ratio estimates of under-reporting of infectious gastrointestinal illness (IGI) in British Columbia
Under-reporting for tiers 1–4 was calculated using data from the BC NSAGI population survey. Crude proportions were calculated as the number of individuals who advanced a tier divided by the number of those eligible to advance. For example, the number of people who went on to seek medical care (numerator) among those who experienced IGI (eligible denominator) represented the under-reporting proportion for tier 2. Direct standardization was used to adjust crude provincial proportions in tiers 1–3 to the age and sex distribution of the overall BC population, using 2001 Census data obtained from Statistics Canada. Due to small numbers, standardization of proportions for tier 4 was not performed. BC-specific data on the proportion of stool samples tested by the laboratory (tier 5), the proportion of those positive (tier 6) and the proportion of positive stools that were reported to local public health (tier 7) were obtained from the NSAGI Laboratory survey. Since the laboratory survey collected information about the proportion of stool specimens submitted for bacterial, viral and ova/parasites examinations that tested positive, a uniform distribution for the number of cases with positive results was constructed. The number of specimens tested for each examination (bacterial, viral and ova/parasites) from all BC laboratories was summed as well as the corresponding proportion positive. A minimum value was constructed by assuming the minimum overlap in testing, i.e. that one person had only one test performed. The maximum value was constructed assuming that one person had all three tests performed. In BC, all laboratories are mandated to report to local public health authorities, which in turn report to the province. The minimum and maximum percentage values for the proportion of cases reported to local public health are given by the range of responses collected from 93 participating laboratories.
Similarly, BC data from the Public Health Reporting survey were used to estimate the range of proportions that represented the subset of cases reported to local public health that were subsequently reported to the province (tier 8). Since the survey proportions reported by laboratories and public health used in tiers 6–8 were based on the educated guesses of respondents rather than calculated data, uniform input distributions bounded by the minimum and maximum responses were used. This distribution is appropriate when few data are available and provides only a crude reflection of the uncertainty of a parameter, since all values within the allowed range have the same constant probability.
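The direct standardization applied to the crude proportions in tiers 1–3 can be illustrated with a short sketch. The strata, counts and standard population below are hypothetical and chosen only to make the arithmetic easy to follow; they are not the survey or 2001 Census figures, and the function names are invented for this example.

```python
def crude_proportion(strata):
    """Events divided by respondents, pooled over all strata."""
    events = sum(s["events"] for s in strata.values())
    total = sum(s["n"] for s in strata.values())
    return events / total

def directly_standardized(strata, standard_pop):
    """Weight each stratum-specific proportion by that stratum's share
    of the standard (reference) population."""
    total_std = sum(standard_pop.values())
    return sum(
        (strata[key]["events"] / strata[key]["n"]) * (standard_pop[key] / total_std)
        for key in standard_pop
    )

# HYPOTHETICAL survey counts by (sex, age group); the sample over-represents
# the older, lower-risk strata.
survey = {
    ("F", "<40"): {"events": 6, "n": 30},
    ("M", "<40"): {"events": 4, "n": 20},
    ("F", "40+"): {"events": 6, "n": 120},
    ("M", "40+"): {"events": 4, "n": 80},
}

# HYPOTHETICAL standard population with equal-sized strata.
standard = {
    ("F", "<40"): 250,
    ("M", "<40"): 250,
    ("F", "40+"): 250,
    ("M", "40+"): 250,
}

crude = crude_proportion(survey)
adjusted = directly_standardized(survey, standard)
```

In this toy example the adjusted proportion exceeds the crude one because the standard population contains relatively more people from the younger strata, where the stratum-specific proportion is higher.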
Factors affecting under-reporting
Bivariate analyses were conducted to examine the relationship between demographic factors (region of residence, age, ethnicity, education level, household income, number of residents in the household), type of symptoms experienced (nausea, vomiting, diarrhoea, abdominal pain, fever, chills, myalgia, headache), medication use in the 28 days prior to onset (antibiotics, laxatives, antacids, immune suppressing drugs), use of medication to treat the episode of IGI (painkillers, anti-diarrhoeals, antihistamines, oral rehydration solutions, herbal remedies), and other risk factors (travel outside North America in the previous 28 days, use of private water source) with two primary outcomes: whether or not a case sought health care (i.e. advanced from tier 1 to tier 2) and whether or not a person who sought health care was asked to provide a stool sample (i.e. advanced from tier 2 to tier 3). Individuals responding ‘I don't know/unsure’ or ‘refused’ to any question were excluded from the analysis of that question. Differences in mean values among two groups were tested using the Wilcoxon two-sample test in Epi-Info version 6.04d (CDC, Atlanta, GA, USA). The strength of association between categorical predictors and both outcome measures was calculated as relative risks (RR) with 95% confidence intervals (CI). Multivariate analysis of these variables and their interaction terms was done using forward and backward stepwise logistic regression as well as best subsets techniques. When expected cells were <5, differences in proportions were calculated using StatXact® version 6 with Cytel Studio™ (Cytel Statistical Software, Cambridge, MA, USA).
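For readers unfamiliar with the measure, a relative risk and its 95% confidence interval can be computed from a 2×2 table as sketched below. The sketch uses the common log-scale (Katz) approximation, which is an assumption here: the text does not state which interval method Epi-Info applied. The counts are hypothetical.

```python
import math

def relative_risk(a, b, c, d, z=1.96):
    """Relative risk with a log-scale confidence interval.

    a, b: exposed with / without the outcome
    c, d: unexposed with / without the outcome
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    rr = risk_exposed / risk_unexposed
    # Standard error of ln(RR) under the Katz approximation.
    se_log = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

# HYPOTHETICAL counts: 20 of 100 cases with a symptom sought care,
# compared with 10 of 100 cases without it.
rr, ci_lower, ci_upper = relative_risk(a=20, b=80, c=10, d=90)
```

An interval that includes 1 indicates that the association is not statistically significant at the 5% level.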
Estimates of under-reporting
Of the people contacted, 44·3% agreed to participate in the telephone survey (4612/10 403). Of those, approximately 10% (451/4612; 9·78% crude, 10·79% age adjusted) indicated they had experienced an IGI that met the survey case definition in the 28 days preceding interview. Of those, approximately 11% (52/451; 11·53% crude, 10·55% age adjusted) sought medical care. Approximately a quarter of these cases were asked to provide a stool sample for testing (12/52, age-adjusted 26·40%) and 83% (10/12) complied. There were significant regional differences in the proportion of people seeking health care for IGI who were asked to submit a stool sample. Request rates were significantly higher in Vancouver (47%) compared with East Kootenay (14%) and Northern Interior (13%, P=0·04, Table 2). Because this difference was of borderline significance, and small sample sizes in this tier contributed to large uncertainty in the estimates, a single provincial estimate of under-reporting was constructed.
Table 2. Under-reporting fractions by survey community
The simulation model, which factored in the uncertainty around the proportions at each stage of under-reporting, estimated that for every case of IGI reported to the province each month, the number of cases in the community ranged from 181 to 611 community cases (5th and 95th percentile estimates, respectively) with a median of 316 cases, and a mean and standard deviation of 347 and 140, respectively (Fig. 2). Table 1 displays the mean, median, 5th and 95th percentile values of the distributions generated for under-reporting at each step in the reporting chain, that is, the cumulative number of cases at each step for every one case reported to the province. The parameter that had the most influence on under-reporting rates was the fraction of individuals with IGI asked to submit a stool sample (r=−0·622).
Fig. 2. Distribution of the estimated overall under-reporting rate of infectious gastrointestinal illness in British Columbia, showing the number of cases in the community for each case reported to the province.
Factors affecting under-reporting
Of the 451 individuals with IGI, only 52 (11·5%) sought medical care for their illness. These individuals differed from those who did not consult a health-care provider in a variety of ways (Table 3). On univariate analysis, individuals who experienced any vomiting during the course of illness were 1·87 times more likely than those with no vomiting to see a health-care provider for their illness; similarly, individuals who self-medicated with anti-nauseants were 2·88 times more likely to seek health care. Individuals with longer durations of either vomiting or diarrhoea were also more likely to seek medical care. Other symptoms including abdominal pain, fever, chills, headache, and tiredness increased the likelihood of consultation. Neither mean age nor the proportion of cases aged <5 years was significantly associated with medical consultation. Forward and backward stepwise logistic regression produced identical results. Only vomiting (OR 2·15, 95% CI 1·03–4·49) and antibiotic use in the previous 28 days (OR 3·59, 95% CI 1·17–10·97) remained significant predictors of health-care-seeking behaviour. The area under the receiver operating characteristic (ROC) curve for this model was 0·51, indicating a low predictive value for the independent variables included in the model.
Table 3. Bivariate predictors of advancement up the reporting pyramid
Physicians were more likely to request a stool sample from older patients (mean age 48 vs. 30 years), from patients with fewer members in their households (mean 2·08 vs. 3·10) and were three times more likely to request a stool sample from patients who used anti-diarrhoeals for treatment of their illness. They were 62% less likely to request stool samples from patients who self-identified with a North American cultural group than with all other cultural groups combined (i.e. European, Asian, African, South American, Australasian, Native American). Physicians were significantly less likely to request sample collection for patients presenting with nausea and chills and were 91% less likely to request a stool specimen for those with a history of vomiting.
This study suggests that an average of approximately 350 cases of gastroenteritis occur in BC for every one case captured in provincial communicable disease statistics. This estimate ranged from 181 to 611 community cases (5th and 95th percentile estimates, respectively), reflecting the uncertainty and variability about the estimate. A population study conducted in Hamilton, Ontario, Canada from February 2001 to February 2002 using the same study methodology, case definitions, and analysis reported a similar mean estimate of 313 community cases per reported case. These results confirm that IGI is highly under-reported in Canadian populations. In BC, these estimates correspond to an incidence rate of 1·3 (95% CI 1·2–1·4) episodes of acute IGI per person-year and 19·7 million annual sick days. As in previous studies, our model is based on the aetiological assessment of IGI through stool sample submission. Although rare, blood and urine cultures may also recover infectious organisms. To the extent that this occurs, the estimates of under-reporting presented here will be slight overestimations. Due to small sample sizes at tier 3, and wide regional variability in physician stool request practices (13–47%), this factor contributed the most to the uncertainty of the overall under-reporting estimate. In the Hamilton study, overall uncertainty was driven by a different factor: the percentage of cases that tested positive. The reason for this difference probably relates to differing sample sizes at a given tier between studies.
Previous international estimates of under-reporting have used varying case definitions, making direct comparison difficult. The current study deliberately chose a broad definition of IGI to (1) ensure directly comparable results between Canadian studies, and (2) enable future comparisons with other studies through subsequent re-analysis using more restrictive definitions. A sensitivity analysis using a more restricted definition of diarrhoea (three or more loose stools or stools with abnormal liquidity in 24 h) resulted in only a minimal change in the estimate of IGI prevalence.
In general, the individuals most likely to seek medical care in this study were those with vomiting as a symptom of illness. Since viral agents are more likely than bacterial agents to induce vomiting as a symptom of infection, this suggests that a disproportionate number of cases of viral illness may be captured in studies that use the physician's office as the setting for monitoring rates of IGI; this is particularly important for studies which aim to evaluate the relative contributions of bacterial, parasitic, and viral pathogens to the burden of IGI in the community. This finding may have been influenced by the year in which the study was conducted. The year 2003 was marked by a sharp increase in the number of Norovirus infections in the community, which may explain why other studies, conducted in earlier times and different locations, found higher rates of consultation for individuals with bacterial infections.
Although those with vomiting symptoms were more likely to seek care, physicians were more likely to request a diagnostic sample from patients that presented with diarrhoea, skewing higher stages of under-reporting towards inclusion of IGI of bacterial or parasitic origin. While the biases introduced in these two tiers tend to counteract each other, the overall effect does not appear to be null. In this study, for every community case of IGI with vomiting as the only symptom, there were 5·6 cases with diarrhoea as the sole symptom and 2·2 cases with both presentations. If no bias were present, this ratio should remain constant across all levels of the pyramid. However, by tier 3, 12 people with diarrhoea and two persons with both symptoms were being asked to submit specimens for every case with vomiting. Therefore, individuals with diarrhoeal symptoms were twice as frequently represented by tier 3 compared with IGI estimates at the community level (5·6 vs. 12). Despite this effect on surveillance estimates, this is expected practice from a clinical standpoint. Physicians are less likely to request vomitus specimens as they are difficult to obtain, are often the result of toxin-mediated processes and, therefore, do not often yield a diagnosis unless processed at specialized laboratories.
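The shift in symptom mix between the community and tier 3 described above can be checked with a few lines of arithmetic. The ratios are those quoted in the text, expressed per case with vomiting only; the variable and function names are illustrative.

```python
# Cases per one vomiting-only case, as quoted in the text.
community = {"vomiting_only": 1.0, "diarrhoea_only": 5.6, "both": 2.2}
tier3 = {"vomiting_only": 1.0, "diarrhoea_only": 12.0, "both": 2.0}

def shares(ratio):
    """Convert a ratio into the fraction of all cases in each symptom group."""
    total = sum(ratio.values())
    return {group: value / total for group, value in ratio.items()}

def enrichment(lower_tier, upper_tier):
    """Factor by which each group's ratio (relative to vomiting-only cases)
    changes between the two tiers; 1.0 would mean no selection bias."""
    return {group: upper_tier[group] / lower_tier[group] for group in lower_tier}

community_shares = shares(community)
tier3_shares = shares(tier3)
change = enrichment(community, tier3)
```

Here `change["diarrhoea_only"]` equals 12/5·6, roughly a doubling, which corresponds to the statement that cases with diarrhoea alone are about twice as frequently represented at tier 3 as in the community.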
Population-based studies of gastrointestinal illness in The Netherlands, England, Norway, the United States and Ireland have examined factors associated with physician consultation. These studies suggest that males, children aged <5 years [12–15], adults aged >64 years [13, 15], urban residents, and individuals with a low level of education or of a low socio-economic class are all more likely to seek physician consultation. Clinical factors associated with consultation include bloody diarrhoea [14, 17], severe illness or a long duration of illness [14, 17]. Foreign travel has also been shown to increase the likelihood of consultation. In the present study, no demographic factors were associated with health-care-seeking behaviour. Two clinical factors, experience of vomiting and antibiotic use in the previous 28 days, remained significantly associated with health-care consultation after adjustment for other clinical and demographic variables in the model. Previous antibiotic therapy may simply be a proxy for regular health-care-seeking behaviour to the extent that those who are likely to seek treatment once are more likely to do so again. Although individuals who attributed their IGI symptoms to medication use were excluded as cases, some misclassification may have occurred.
Whether or not a physician requested a stool specimen depended largely on the patient's experience of illness but also on several demographic characteristics. Similarly to US FoodNet findings, physicians were more likely to request stool samples from older patients. In 2000, provincial guidelines were developed and circulated to all BC physicians suggesting stool culture in adult patients with fever >38·5°C, mucous/bloody stool, hypotension, dehydration or severe abdominal pain, prolonged experience of diarrhoea (>7 days duration), immunocompromised status, a history of travel to a developing country, and exposure to faecal matter or untreated or potentially contaminated water. Many of these were also cited by BC physicians in a recent survey of factors influencing stool request behaviour. In contrast to the guidelines and self-reported physician behaviour, our population-based study did not suggest these factors as predictors of a stool request. What did prompt a test request was the use of over-the-counter anti-diarrhoeals to treat illness, region of residence and ethnicity of non-North American origin. Other factors may have gone undetected because of the small sample size associated with tier 3. Given that the temporal relationship between use of anti-diarrhoeals and test request is unknown, its predictive ability is questionable. For physicians, anti-diarrhoeal use may have served as an indicator of illness severity and prompted a test request or may have been prescribed following such a request. Health-care providers were significantly less likely to request stool samples from North American patients. Ethnicity, in this study, was not associated with foreign travel (P=0·11) and therefore did not confound a possible association between travel and stool request behaviour. Stool sample request rates were significantly higher in Vancouver (47%) compared with East Kootenay (14%) and Northern Interior (13%) (Table 2).
Although Vancouver has a high multi-ethnic population, a factor associated with stool requests, there was no significant difference in ethnicity between regions (P=0·11). Interestingly, in a concurrent self-reported physician survey, East Kootenay physicians reported the highest proportion of stool requests, contradicting these population-based study findings that suggest Vancouver physicians request stool samples with the highest frequency.
This study confirms the degree to which provincial surveillance statistics are under-reported and provides insight into some of the individual characteristics that make an individual more likely to move up through sequential tiers of assessment towards final capture in provincial IGI statistics. To our knowledge, this is the first study to use the same population survey from which under-reporting estimates were derived to examine characteristics of under-reporting at multiple tiers using multiple regression techniques. Although limited by smaller sample sizes at higher tiers of the pyramid, it confirmed our suspicions that, when compared with baseline occurrence, IGI of viral origin is significantly more under-reported than IGI of bacterial origin. Larger sample sizes following similar methods would allow a more robust assessment and perhaps detect additional characteristics of individuals that predict their advancement to higher levels of the reporting pyramid. A better understanding of such characteristics will ultimately allow public health to adjust biased surveillance data, taking into account differential degrees of under-reporting as a function of certain demographic or clinical characteristics.