• Different combinations of diagnostic tests have been used in pneumococcal vaccine trials.
• The estimated accuracy of composite diagnostic standards varied substantially between trials.
• Pneumococcal vaccine efficacy estimates are not comparable between trials, and their pooled estimates are biased.
Pneumococcal pneumonia (PP) is a major cause of morbidity and mortality among adults. In many countries, a 23-valent pneumococcal polysaccharide vaccine (PPV23) is recommended for adults aged ⩾65 years to prevent invasive pneumococcal disease, but its protective efficacy against PP remains in question [Reference Moberley1–Reference Falkenhorst5]. Recently, a vaccine trial demonstrated the protective efficacy of 13-valent pneumococcal conjugate vaccine (PCV13) against vaccine-type PP among older adults [Reference Bonten6]. However, no study has formally compared the efficacy of PPV23 and PCV13.
One of the major limitations of pneumococcal vaccine trials is the lack of a gold-standard diagnostic for PP [Reference Jokinen7]. Almost all tests for identifying pneumococcus in blood, sputum and urine samples are imperfect [Reference Said8–Reference Sinclair10]. Blood culture has been the gold standard for pneumococcal bacteremia; however, its sensitivity for diagnosing PP is very low because at most a quarter of PP cases are bacteremic [Reference Said8]. Culture and polymerase chain reaction-based methods using sputum samples are believed to be less specific, although supporting evidence is lacking. A commercial urinary immunochromatographic test for pneumococcal antigen (ICT) is sufficiently specific (93–100%) but less sensitive (67–82%), and its accuracy varies by setting [Reference Sinclair10].
To overcome this limitation, pneumococcal vaccine studies often use a composite diagnostic standard [Reference Cordoba11]: pneumonia patients are screened with multiple diagnostic tests and PP is diagnosed if any test is positive. The use of multiple imperfect tests increases the overall sensitivity but decreases the overall specificity [Reference Naaktgeboren12], and the use of an inaccurate diagnostic test leads to underestimation of the true vaccine efficacy (VE) [Reference Lachenbruch13, Reference Orenstein14]. However, because reference standards and analytical methods have been lacking, to the best of our knowledge no study has evaluated the performance of composite diagnostic standards as an outcome in pneumococcal vaccine trials. In this study, we estimated the accuracy of composite diagnostic standards for PP from previous trial results using a novel approach.
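The sensitivity and specificity trade-off of an "any test positive" rule can be made concrete with a minimal sketch. It assumes that the component tests err independently given true disease status, which the cited studies do not necessarily support, and the accuracy values are illustrative picks loosely based on the ranges quoted above, not trial data. The function name `composite_any_positive` is hypothetical.

```python
from math import prod

def composite_any_positive(tests):
    """Accuracy of an 'any test positive' composite standard.

    tests: iterable of (sensitivity, specificity) pairs.
    ASSUMPTION: test errors are conditionally independent given
    true disease status.
    """
    tests = list(tests)
    # A true case is missed only if every component test misses it.
    sensitivity = 1 - prod(1 - se for se, _ in tests)
    # A non-case stays negative only if every component test is negative.
    specificity = prod(sp for _, sp in tests)
    return sensitivity, specificity

# Illustrative values only: blood culture (Se 0.25, Sp 1.00)
# combined with a urinary antigen test (Se 0.75, Sp 0.95).
se, sp = composite_any_positive([(0.25, 1.00), (0.75, 0.95)])
print(f"composite Se = {se:.4f}, Sp = {sp:.4f}")
```

With these inputs the composite sensitivity (0.8125) exceeds that of either test alone, while the composite specificity (0.95) can be no better than that of the least specific component.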
To establish formulas for the sensitivity and specificity of an outcome measurement, we used a simple randomised controlled trial (RCT) model similar to that used in a previous study [Reference Lachenbruch13]. In this model, vaccinated and unvaccinated people are followed up over a specified period. If they develop all-cause pneumonia (ACP), samples are collected and tested for pneumococcus. VE was calculated as 1 minus the risk ratio.
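As a small illustration of the VE definition, the following sketch computes 1 minus the risk ratio from case counts. The function name and the counts are hypothetical and are not taken from any trial.

```python
def vaccine_efficacy(cases_vaccinated, n_vaccinated, cases_control, n_control):
    """VE = 1 - (risk in vaccinated / risk in unvaccinated)."""
    risk_vaccinated = cases_vaccinated / n_vaccinated
    risk_control = cases_control / n_control
    return 1 - risk_vaccinated / risk_control

# Hypothetical counts: 30/1000 cases among vaccinated, 50/1000 among controls.
ve = vaccine_efficacy(30, 1000, 50, 1000)
print(f"VE = {ve:.2f}")
```

A negative value indicates a higher observed risk in the vaccinated group than in the control group.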
The observed VE against ACP (ve a), observed VE against PP (ve p) and true VE against PP (ve π) are described using the following parameters:
a c = observed risk of ACP in unvaccinated individuals.
a v = observed risk of ACP in vaccinated individuals.
p c = observed risk of PP in unvaccinated individuals.
p v = observed risk of PP in vaccinated individuals.
π c = true risk of PP in unvaccinated individuals.
π v = true risk of PP in vaccinated individuals.
Se = test sensitivity for diagnosing PP.
Sp = test specificity for diagnosing PP.
To simplify the following discussion, we introduce four assumptions:
Assumption 1 (A1): the misclassification in the diagnosis of ACP is non-differential.
Assumption 2 (A2): the pneumococcal vaccine does not change the risk of non-PP.
Assumption 3 (A3): the directions of ve a and ve p are identical and the magnitude of ve p is equal to or greater than that of ve a (i.e., 0 < ve a ⩽ ve p or ve p ⩽ ve a < 0).
Assumption 4 (A4): the pneumococcal vaccine does not affect Se and Sp.
Then, Se and the minimum value of Sp (Sp min) are given as follows (technical details are provided in Supplementary materials):
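One way to arrive at expressions of this kind is sketched below. It is a reconstruction under an assumed misclassification model, not a quotation from the Supplementary materials, which remain the authoritative source. The symbols are the parameters defined above, written with subscripts. The sensitivity expression agrees with the description of Se as a risk difference ratio in the table footnote; the specificity bound additionally assumes an informative test (Se + Sp ⩾ 1).

```latex
% ASSUMED model (not quoted from the source): among pneumonia cases,
% true PP is detected with probability Se, and non-PP pneumonia is
% falsely labelled PP with probability 1 - Sp (A4: same in both arms).
\begin{align}
p_c &= Se\,\pi_c + (1 - Sp)\,(a_c - \pi_c), \\
p_v &= Se\,\pi_v + (1 - Sp)\,(a_v - \pi_v).
\end{align}
% A2: the non-PP risk is equal in both arms, a_c - \pi_c = a_v - \pi_v,
% so \pi_c - \pi_v = a_c - a_v. Subtracting the second line from the first:
\begin{equation}
Se = \frac{p_c - p_v}{a_c - a_v}.
\end{equation}
% Solving the second line for Sp. If Se + Sp >= 1, then p_v <= Se a_v and
% the right-hand side is smallest at \pi_v = 0, which gives a lower bound:
\begin{equation}
Sp = 1 - \frac{p_v - Se\,\pi_v}{a_v - \pi_v}
\;\ge\; 1 - \frac{p_v}{a_v} = Sp_{\min}.
\end{equation}
```

In this sketch the true PP risks cancel out of the sensitivity expression, which is why it can be evaluated from observed risks alone, whereas the specificity still depends on the unknown true risk in the vaccinated group and so can only be bounded.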
We conducted a systematic literature review to identify RCTs that investigated the efficacy of pneumococcal vaccines against ACP and PP in adult populations. We searched PubMed for English language articles published between 1 January 1977 and 30 March 2017, with the terms ‘Streptococcus pneumoniae’, ‘pneumococcus’, ‘pneumococcal’, ‘vaccine’, ‘efficacy’, ‘trial’, and ‘adult’. We also reviewed relevant articles identified in previous systematic reviews [Reference Moberley1–Reference Diao3]. Studies were included if they were RCTs (with placebo, another vaccine, or no vaccine as the control), measured both ACP and PP as outcomes and satisfied assumption A3; otherwise, they were excluded. Data were extracted from published results. The median values and 95% credible intervals (CIs) for ve a, ve p, Se and Sp min were estimated based on non-informative priors using WinBUGS 1.4.3 (Medical Research Council and Imperial College London, UK) [Reference Lunn, Thomas and Spiegelhalter15], a statistical software package designed for Bayesian analysis. For the Markov Chain Monte Carlo procedures, we ran 50 000 iterations, of which 20 000 were used for burn-in.
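The estimation step can be approximated without WinBUGS by a minimal sketch that draws directly from conjugate Beta posteriors under uniform Beta(1, 1) priors. This is a stand-in for the MCMC procedure, not a reproduction of the original model code, which is not shown in the text. The Se and Sp min expressions used here, (p c − p v)/(a c − a v) and 1 − p v/a v, are assumed reconstructions, and the case counts are hypothetical.

```python
import random
from statistics import median

def posterior_summary(acp_v, pp_v, n_v, acp_c, pp_c, n_c,
                      draws=20000, seed=1):
    """Posterior (2.5%, median, 97.5%) for ve_a, ve_p, Se and Sp_min.

    ASSUMPTIONS: uniform Beta(1, 1) priors sampled directly instead of
    MCMC; Se = (p_c - p_v) / (a_c - a_v) and Sp_min = 1 - p_v / a_v are
    reconstructed forms, not confirmed by the source.
    """
    rng = random.Random(seed)
    samples = {"ve_a": [], "ve_p": [], "se": [], "sp_min": []}
    for _ in range(draws):
        # Risk of all-cause pneumonia (ACP) in each arm.
        a_v = rng.betavariate(acp_v + 1, n_v - acp_v + 1)
        a_c = rng.betavariate(acp_c + 1, n_c - acp_c + 1)
        # PP cases are a subset of ACP cases, so sample the PP fraction.
        q_v = rng.betavariate(pp_v + 1, acp_v - pp_v + 1)
        q_c = rng.betavariate(pp_c + 1, acp_c - pp_c + 1)
        p_v, p_c = a_v * q_v, a_c * q_c
        samples["ve_a"].append(1 - a_v / a_c)
        samples["ve_p"].append(1 - p_v / p_c)
        samples["se"].append((p_c - p_v) / (a_c - a_v))
        samples["sp_min"].append(1 - q_v)  # equals 1 - p_v / a_v
    summary = {}
    for name, values in samples.items():
        values.sort()
        lower = values[int(0.025 * (draws - 1))]
        upper = values[int(0.975 * (draws - 1))]
        summary[name] = (lower, median(values), upper)
    return summary

# Hypothetical trial: 40 000 per arm; ACP 700 vs 800; PP 100 vs 150.
for name, (lo, med, hi) in posterior_summary(
        700, 100, 40000, 800, 150, 40000).items():
    print(f"{name}: median {med:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

Because Se is a ratio of two risk differences, individual draws in which the ACP difference is close to zero produce extreme values, so the median is a more stable summary than the mean and the interval can extend beyond 100%.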
We identified seven RCTs that investigated the efficacy of pneumococcal vaccines against ACP and PP in adult populations. Two RCTs were excluded: one did not satisfy assumption A3 (ve a < ve p < 0 in the study) [Reference Simberkoff16] and one did not include a sufficient number of PP events (two in the vaccinated group and one in the placebo group) [Reference Izumi17]. Finally, five RCTs, comprising one 14-valent PPV trial, three PPV23 trials and one PCV13 trial, were included in our analysis.
Characteristics of the included RCTs are shown in Table 1. All trials except the PPV23 trial conducted by Örtqvist et al. [Reference Ortqvist18] showed positive VE estimates. All RCTs used different combinations of diagnostic tests for PP. Four PPV trials used respiratory specimen culture, while the PCV13 trial [Reference Bonten6] used a newly developed serotype-specific urinary antigen detection (UAD) assay [Reference Pride19]. Only the trial by Örtqvist et al. used serological assays to detect antibodies against pneumolysin [Reference Jalonen20, Reference Leinonen21]. The estimated Se and Sp min values of the composite diagnostic standards varied substantially between trials, from 48.8% to 98.1% and from 69.0% to 97.3%, respectively. The highest Se value was observed in the PCV13 trial, while the lowest Se and Sp min values were observed in the PPV23 trial by Örtqvist et al.
PP, pneumococcal pneumonia; ACP, all-cause pneumonia; ve a, observed vaccine efficacy against ACP; ve p, observed vaccine efficacy against PP; PPV14, 14-valent pneumococcal polysaccharide vaccine; PPV23, 23-valent pneumococcal polysaccharide vaccine; PCV13, 13-valent pneumococcal conjugate vaccine; ICT, immunochromatographic test; CI, credible interval.
a As the estimated sensitivity is given as a risk difference ratio in our formula, its value can exceed 100%. When the assumption A3 holds, the median value of estimated sensitivity does not exceed 100%; however, its 95% credible interval may still include 100%.
In this study, we demonstrated that: (1) different combinations of diagnostic tests have been used to measure PP in pneumococcal vaccine trials; and (2) the estimated accuracy of composite diagnostic standards varied substantially between trials. The use of an inaccurate diagnostic test leads to underestimation of the true VE, and low specificity biases VE estimates more than low sensitivity does [Reference Lachenbruch13, Reference Orenstein14]. Our findings indicate that pneumococcal VE estimates against PP are not directly comparable between RCTs.
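The asymmetry between sensitivity and specificity can be illustrated with a short sketch. It assumes a simple non-differential misclassification model in which the observed PP risk equals Se times the true PP risk plus (1 − Sp) times the non-PP pneumonia risk, with the non-PP risk identical in both arms. The model form, the function name `observed_ve` and all input values are assumptions made for illustration.

```python
def observed_ve(pp_risk_control, true_ve, non_pp_risk, se, sp):
    """Observed VE against PP under an assumed misclassification model.

    ASSUMPTION: observed risk = Se * true PP risk + (1 - Sp) * non-PP risk,
    with Se, Sp and the non-PP risk identical in both arms.
    """
    pp_risk_vaccinated = pp_risk_control * (1 - true_ve)
    observed_control = se * pp_risk_control + (1 - sp) * non_pp_risk
    observed_vaccinated = se * pp_risk_vaccinated + (1 - sp) * non_pp_risk
    return 1 - observed_vaccinated / observed_control

# Hypothetical inputs: true VE 50%, PP risk 1%, non-PP pneumonia risk 4%.
low_sensitivity = observed_ve(0.01, 0.5, 0.04, se=0.6, sp=1.0)
low_specificity = observed_ve(0.01, 0.5, 0.04, se=1.0, sp=0.9)
print(f"Se 0.6, Sp 1.0: observed VE = {low_sensitivity:.3f}")
print(f"Se 1.0, Sp 0.9: observed VE = {low_specificity:.3f}")
```

In this example, imperfect sensitivity alone leaves the observed VE at the true value of 50%, because the same fraction of cases is missed in both arms, whereas a 10-point loss of specificity pulls the observed VE down to about 36%.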
Recent meta-analyses of PPV23 efficacy against PP in older adults showed inconsistent findings [Reference Diao3–Reference Falkenhorst5]. Although two meta-analyses showed a non-significant protective trend [Reference Diao3, Reference Schiffner-Rohe4], a meta-analysis by Falkenhorst et al. demonstrated significant PPV23 efficacy against PP after excluding trials that had used serological assays [Reference Falkenhorst5]. The serological assays for PP were developed in the early 1990s [Reference Jalonen20, Reference Leinonen21] and used in epidemiological studies. However, their inaccuracy was demonstrated in later validation studies [Reference Falkenhorst5, Reference Scott, Hall and Leinonen22] and the assays are now rarely used. In the trial by Örtqvist et al., most PP cases were diagnosed by these assays. Indeed, among the five RCTs included in our study, the lowest Se and Sp min values were observed in that trial. Including this trial in meta-analyses is therefore expected to bias pooled VE estimates.
On the other hand, high Se and Sp min values were observed in the PCV13 trial. The majority of PP cases in this trial were diagnosed by the serotype-specific UAD, and a validation study demonstrated that its sensitivity and specificity for the diagnosis of invasive pneumococcal disease are 98% and 100%, respectively [Reference Huijts23]. These findings suggest that the PCV13 efficacy estimates are not comparable with the PPV efficacy estimates, which were measured with less accurate diagnostic tests.
The lack of a standardized pneumonia outcome is a major limitation in pneumococcal vaccine trials [Reference Jokinen7]. Although current pneumococcal vaccines do not cover all pneumococcal serotypes, few trials have measured vaccine-type PP [Reference Suzuki24]; instead, almost all previous trials have measured less specific outcomes such as ACP and PP using different definitions [Reference Moberley1–Reference Falkenhorst5]. ACP includes a variety of aetiologies other than pneumococcus, such as Haemophilus influenzae and viruses [Reference Morimoto25], and PP includes a substantial proportion of non-vaccine-type PP. The inclusion of these vaccine-unrelated pneumonias decreases the specificity of the outcome and leads to underestimation of VE. In the current study, as long as assumption A2 holds, the proportion of non-PP in ACP does not affect our estimated accuracy of the outcome for PP. If the risk of non-PP increases in vaccinated individuals due to replacement, our method overestimates the true accuracy; if the risk decreases in vaccinated individuals due to cross-protection, our method underestimates the true accuracy. However, such effects have not been observed in previous studies, including our recent vaccine effectiveness study [Reference Suzuki24]. On the other hand, applying our method to estimate the accuracy of the outcome for vaccine-type PP requires an additional assumption of zero efficacy against non-vaccine-type PP. This assumption may not hold in real settings because of the serotype replacement induced by the PCVs [Reference Hausdorff and Hanage26]. Another limitation of the composite diagnostic standard in pneumococcal vaccine trials is that samples from ACP cases are not always tested for pneumococcus. These missing test results may decrease the overall sensitivity and specificity of the outcome, as reflected in our estimates.
In this study, we proposed and applied a new method to estimate the accuracy of composite diagnostic standards for PP used in pneumococcal vaccine trials. Latent class analysis (LCA) has been used to estimate the sensitivity and specificity of individual tests for diagnosing PP in the absence of a gold standard [Reference Butler27–Reference Blake29]. LCA estimates the accuracy of each test based on the observed frequency of the possible combinations of test results. The advantage of our method is its ability to assess the accuracy of outcomes measured by multiple diagnostic tests without using individual test results. Although several assumptions are required, our method may also be useful for evaluating pneumonia outcomes used in pediatric PCV trials [Reference Madhi and Klugman30].
Our study has limitations. We assumed that the sensitivity and specificity for PP are identical between vaccinated and unvaccinated groups, although there is no evidence to support this assumption. If the proportion of tested samples among ACP cases differs between the vaccinated and unvaccinated groups, our estimates may be biased; however, we can reasonably assume that the probability of testing is almost identical between the two groups in RCTs. Additionally, systematic and random errors in the trials may affect our estimates. The observed differences in our accuracy estimates between RCTs may be partially explained by different population characteristics (e.g., general population [Reference Bonten6, Reference Riley31] vs. high-risk population [Reference Ortqvist18, Reference Alfageme32, Reference Maruyama33]). Notably, the trials with the highest PP incidence (the trials by Örtqvist et al. [Reference Ortqvist18] and Maruyama et al. [Reference Maruyama33]) were those with the lowest sensitivity. Factors other than the outcome definition must also affect the estimates. Finally, only the minimum value of specificity can be estimated with this approach.
In conclusion, the accuracy of composite diagnostic standards for PP varies between RCTs because different combinations of imperfect tests are used. Unless the outcome measurement is standardized, pneumococcal VE estimates are not comparable and their pooled estimates are biased.
The supplementary material for this article can be found at https://doi.org/10.1017/S0950268818000651
This work was supported by Nagasaki University.
Declaration of interest
K.A. reports speaker fees from Eli Lilly, Takeda and Asahi Kasei Pharma. K.M. reports speaker fees from Taisho Toyama Pharmaceutical, Pfizer and Asahi Kasei Pharma. All other authors declare no competing interests.