
Effect size calculation needs to be specified in detail: comment on Ying et al.

Published online by Cambridge University Press: 08 July 2022

Yan Luo*
Affiliation:
Department of Health Promotion and Human Behavior, School of Public Health in the Graduate School of Medicine, Kyoto University, Kyoto, Japan
Toshi A. Furukawa
Affiliation:
Department of Health Promotion and Human Behavior, School of Public Health in the Graduate School of Medicine, Kyoto University, Kyoto, Japan
Author for correspondence: Yan Luo, E-mail: lilacluo@gmail.com, luo.yan.2u@kyoto-u.ac.jp

Type: Correspondence
Copyright © The Author(s), 2022. Published by Cambridge University Press

To the Editor

We read with great interest the article by Ying et al. (2022), who reported a well-conducted randomized controlled trial of cognitive-behavioral therapy (CBT) for alleviating depressive symptoms among Chinese patients with subthreshold depression. The results indicated that internet-based CBT (ICBT) was significantly superior not only to a waiting list but also to face-to-face CBT. Effect sizes are critically informative when interpreting the results of clinical trials. However, we have some concerns about the effect sizes reported in this study.

In general, the standardized mean difference (s.m.d.) is widely used in clinical trials when the outcomes are continuous. The s.m.d. is obtained by dividing a mean difference (m.d.) by a standard deviation (s.d.), which allows comparison between studies that use different measuring instruments. However, there are various methods to calculate the s.m.d.: it can be computed from different m.d.s (e.g. the m.d. of endpoint scores, the m.d. of change scores from baseline, or the m.d. from a model in which the baseline score is adjusted) and different s.d.s (e.g. the pooled s.d. of endpoint scores, the pooled s.d. of change scores, the pooled s.d. of baseline scores, or an s.d. converted from model statistics). The s.m.d.s estimated by these methods can differ substantially from one another, raising potential problems regarding reproducibility, selective reporting, and proper interpretation of how large the effect is (Luo et al., 2022). For instance, without prespecifying the calculation method, researchers may compute s.m.d.s using several methods and select the largest one to report. Additionally, Cohen's rule of thumb, often used as a reference for interpreting effect sizes in clinical research, is hard to apply if different calculation methods produce different s.m.d. values for the same outcome of the same study.
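
To illustrate this variability, below is a minimal Python sketch. All summary statistics (group sizes, means, s.d.s) are invented for illustration and are not taken from any trial:

```python
import math

# Hypothetical two-arm trial (arm 1 = intervention, arm 2 = control);
# all numbers invented for illustration.
n1, n2 = 50, 50
base_mean1, base_sd1 = 20.0, 4.0   # baseline mean and SD, arm 1
base_mean2, base_sd2 = 20.5, 3.5   # baseline mean and SD, arm 2
end_mean1, end_sd1 = 14.0, 4.5     # endpoint mean and SD, arm 1
end_mean2, end_sd2 = 16.0, 4.0     # endpoint mean and SD, arm 2

def pooled_sd(sd_a, n_a, sd_b, n_b):
    """Pooled standard deviation of two groups."""
    return math.sqrt(((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2)
                     / (n_a + n_b - 2))

# Different mean differences (lower scores = better, so control minus intervention)
md_endpoint = end_mean2 - end_mean1                              # 2.0
md_change = (end_mean2 - base_mean2) - (end_mean1 - base_mean1)  # 1.5

# Different standard deviations
sd_endpoint = pooled_sd(end_sd1, n1, end_sd2, n2)    # ~4.26
sd_baseline = pooled_sd(base_sd1, n1, base_sd2, n2)  # ~3.76

print(md_endpoint / sd_endpoint)  # ~0.47: endpoint m.d. / endpoint SD
print(md_endpoint / sd_baseline)  # ~0.53: endpoint m.d. / baseline SD
print(md_change / sd_baseline)    # ~0.40: change m.d. / baseline SD
```

Even with identical data, these three common choices yield s.m.d.s ranging from roughly 0.40 to 0.53.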

In Ying et al.'s study, the method for calculating the effect size was described in the Method section as follows: ‘Within-group and between-group effect size (Cohen's d) were based on the method suggested for mixed model analysis (Feingold, 2009; Morris & DeShon, 2002; Thorsell et al., 2011)’. However, it remains unclear which m.d. and which s.d. they used to calculate the effect sizes. Both Feingold's and Morris & DeShon's papers suggest dividing the m.d. estimated from a mixed model by the pooled baseline s.d. (Feingold, 2009; Morris & DeShon, 2002). In Thorsell et al.'s study, on the other hand, the square root of the variance estimate from the mixed model was used as the denominator (Thorsell et al., 2011).
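
A sketch of the two denominators follows; the numbers are hypothetical, and `md_model` stands for an adjusted between-group difference estimated by a mixed model:

```python
import math

# Hypothetical outputs of a fitted mixed model (invented for illustration)
md_model = 1.5            # adjusted between-group mean difference
sd_baseline_pooled = 4.0  # pooled baseline SD (Feingold; Morris & DeShon)
model_variance = 10.0     # variance estimate from the mixed model (Thorsell et al.)

d_baseline = md_model / sd_baseline_pooled      # 0.375
d_model = md_model / math.sqrt(model_variance)  # ~0.47
print(d_baseline, d_model)
```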

Ying et al.'s reported s.m.d.s do not seem to follow either of these methods. Take, for example, the between-group effect size on the CES-D at post-intervention for ICBT v. face-to-face CBT, reported as 0.06 (95% confidence interval 0.02–0.09) in their Table 4. Using the values reported in their Table 3 (m.d. at endpoint 1.6, pooled baseline s.d. 3.75, pooled endpoint s.d. 3.80), the s.m.d. would be 0.43 using the baseline s.d. or 0.42 using the endpoint s.d. The reported s.m.d. could have been calculated from some other s.d. not reported in the paper, but for an m.d. of 1.6 to yield an s.m.d. of 0.06, that s.d. would need to be approximately 27. It is very difficult to imagine a population with such large variability on the CES-D, whose scores range from 0 to 60. Similar discrepancies exist for the other between-group effect sizes reported in their Table 4.
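
For transparency, here is the arithmetic behind our check, using only the values reported in Ying et al.'s Tables 3 and 4:

```python
# Values from Ying et al.'s Tables 3 and 4 (CES-D, ICBT v. face-to-face CBT)
md_endpoint = 1.6    # mean difference at post-intervention (Table 3)
sd_baseline = 3.75   # pooled baseline SD (Table 3)
sd_endpoint = 3.80   # pooled endpoint SD (Table 3)
smd_reported = 0.06  # between-group effect size reported in Table 4

print(md_endpoint / sd_baseline)   # ~0.43, using the baseline SD
print(md_endpoint / sd_endpoint)   # ~0.42, using the endpoint SD
print(md_endpoint / smd_reported)  # ~26.7: the SD needed to obtain 0.06
```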

Because the s.m.d. values that we calculated and those reported by the authors are so different, conclusions about how large the effect of the intervention is could differ substantively. The authors stated in the article that Cohen's rule of thumb was used as an aid to interpretation. Under this rule, an s.m.d. of about 0.4 would indicate a moderate effect, while an s.m.d. of 0.06 would indicate less than a small effect. We therefore wonder how the effects of ICBT and CBT should properly be interpreted, and whether the effect sizes estimated in this article can appropriately be compared with those of previous studies, which may have used different s.m.d. calculation methods.

In summary, because s.m.d.s can be calculated from different m.d.s and s.d.s, and because these estimates can vary substantially, researchers should be careful in reporting them and readers should be mindful of how the reported s.m.d.s were calculated. As no single calculation method can yet be recommended for universal use (Luo et al., 2022), it is desirable for researchers to report their calculation methods in detail to increase transparency and reproducibility.

Prespecifying the calculation method may also help to avoid selective reporting bias. Meanwhile, future methodological studies are warranted to elucidate which s.m.d. calculation methods are recommendable.

Conflict of interest

None.

References

Feingold, A. (2009). Effect sizes for growth-modeling analysis for controlled clinical trials in the same metric as for classical analysis. Psychological Methods, 14(1), 43–53. doi: 10.1037/a0014699
Luo, Y., Funada, S., Yoshida, K., Noma, H., Sahker, E., & Furukawa, T. A. (2022). Large variation existed in standardized mean difference estimates using different calculation methods in clinical trials. Journal of Clinical Epidemiology, 149, 89–97. doi: 10.1016/j.jclinepi.2022.05.023
Morris, S. B., & DeShon, R. P. (2002). Combining effect size estimates in meta-analysis with repeated measures and independent-groups designs. Psychological Methods, 7(1), 105–125. doi: 10.1037/1082-989x.7.1.105
Thorsell, J., Finnes, A., Dahl, J., Lundgren, T., Gybrant, M., Gordh, T., & Buhrman, M. (2011). A comparative study of 2 manual-based self-help interventions, acceptance and commitment therapy and applied relaxation, for persons with chronic pain. The Clinical Journal of Pain, 27(8), 716–723. doi: 10.1097/AJP.0b013e318219a933
Ying, Y., Ji, Y., Kong, F., Wang, M., Chen, Q., Wang, L., … Ruan, L. (2022). Efficacy of an internet-based cognitive behavioral therapy for subthreshold depression among Chinese adults: A randomized controlled trial. Psychological Medicine, 1–11. doi: 10.1017/S0033291722000599