Dr Mitchell raises two interesting issues: (1) the way we calculated results of the screening procedure, in particular the number needed to screen to treat one additional patient; and (2) the PHQ–9 as a screening instrument.
As Dr Mitchell mentions, our study was indeed a screening implementation study. We wanted to learn whether screening in high-risk groups could detect a substantial number of previously undetected, treatable patients with depression. To this end we conducted a pragmatic study and determined the gain of a stepped care screening (and treatment) programme in real practice, with real doctors and real patients. A GP who wants to screen their patients can read what to expect. Failures (refusals, no-shows, misclassifications) are inherent to such conditions and should be incorporated in the calculations.
We defined our target population and included patients (n = 2005) from the GPs' medical files and surgery. Our screening cascade yielded 17 new patients who could be treated for depression. Perhaps this calculation is a bit optimistic because treatment was directly available at no cost, which is not always the case.
Dr Mitchell makes a comparison with a drug trial. Unfortunately, we do not think this comparison makes the interpretation of our data easier. We consider the PHQ to be our screen and use the SCID as the reference diagnostic standard. So the 780 patients who returned the PHQ and gave informed consent form the screened population. From there we count downwards to the number of detected cases and upwards to the number of patients who need to be invited in order to screen those 780 patients.
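The counting downwards and upwards described here can be sketched as simple arithmetic. The snippet below is only an illustration that uses the figures quoted in this letter (2005 included, 780 screened, 17 new treatable patients). Treating the 2005 included patients as the number invited is an assumption, the function name `number_needed` is hypothetical, and the results are not necessarily the figures reported in the original paper.

```python
import math


def number_needed(patients_at_step: int, new_treatable: int) -> int:
    """Patients required at a given step per additional treated patient, rounded up."""
    if new_treatable <= 0:
        raise ValueError("new_treatable must be positive")
    return math.ceil(patients_at_step / new_treatable)


# Figures quoted in the letter.
INVITED = 2005        # assumption: the included target population equals those invited
SCREENED = 780        # returned the PHQ and gave informed consent
NEW_TREATABLE = 17    # new patients who could be treated for depression

# Counting from the screened population (the PHQ as the screen).
needed_to_screen = number_needed(SCREENED, NEW_TREATABLE)

# Counting upwards to the invited population, so that refusals and
# no-shows are included, in the spirit of an intention-to-treat analysis.
needed_to_invite = number_needed(INVITED, NEW_TREATABLE)

print(f"Number needed to screen: {needed_to_screen}")
print(f"Number needed to invite: {needed_to_invite}")
```

The two results differ only in the denominator, which is why the choice of starting population matters so much for the interpretation.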
There can be discussion about the way we corrected for patients who did not adhere to the programme. We presented each step (with the number of people who refused or did not attend, and the reasons for this), so that readers can make their own judgements, as Dr Mitchell has done. However, we disagree with his interpretation. If we use his analogy of a drug trial, then an intention-to-treat analysis is the best analysis. That means that non-adherence should be incorporated in the number needed to treat (or screen). Starting the analysis with the number of patients who completed the SCID (as Dr Mitchell does) provides GPs only with the number of patients they have to see after a pre-screen with the PHQ–9.
It is correct that the PHQ–9 misses some cases, although not as many as Dr Mitchell supposes. We used the PHQ–9 in the screening mode (a cut-off score of 10) and not in the diagnostic mode (algorithm). The sensitivity of the screening mode is 0.93, not 0.77.¹ However, a GP who uses the PHQ–9 will follow the results of the screening and invite only patients with a positive score for clinical evaluation, thus also missing patients with a score below the threshold. Unfortunately, we do not have SCID data for those who scored negative on the PHQ.