
Test–Retest Reliability of Concussion Baseline Assessments in United States Service Academy Cadets: A Report from the National Collegiate Athletic Association (NCAA)–Department of Defense (DoD) CARE Consortium

  • Megan N. Houston (a1), Kathryn L. Van Pelt (a2), Christopher D’Lauro (a3), Rachel M. Brodeur (a4), Darren E. Campbell (a5), Gerald T. McGinty (a3), Jonathan C. Jackson (a3), Tim F. Kelly (a6), Karen Y. Peck (a6), Steven J. Svoboda (a7), Thomas W. McAllister (a8), Michael A. McCrea (a9), Steven P. Broglio (a10) and Kenneth L. Cameron (a1)...



In response to advancing clinical practice guidelines regarding concussion management, service members, like athletes, complete a baseline assessment prior to participating in high-risk activities. While several studies have established test stability in athletes, no investigation to date has examined the stability of baseline assessment scores in military cadets. The objective of this study was to assess the test–retest reliability of a baseline concussion test battery in cadets at U.S. Service Academies.


All cadets participating in the Concussion Assessment, Research, and Education (CARE) Consortium investigation completed a standard baseline battery that included memory, balance, symptom, and neurocognitive assessments. Annual baseline testing was completed during the first 3 years of the study. A two-way mixed-model analysis of variance (intraclass correlation coefficient, ICC(3,1)) and Kappa statistics were used to assess the stability of the metrics at 1-year and 2-year time intervals.
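As an illustration of the two statistics named above, the following is a minimal Python sketch, not the study's analysis code. It assumes the ICC is the single-measure, consistency form of Shrout and Fleiss (1979), computed from two-way ANOVA mean squares, and that the Kappa statistic is Cohen's unweighted kappa for two categorical ratings; the function names `icc_3_1` and `cohen_kappa` and the input layout (one row per cadet, one column per test session) are hypothetical.

```python
from collections import Counter


def icc_3_1(scores):
    """ICC(3,1): two-way mixed model, single measure, consistency.

    scores: list of rows, one row per subject, one column per test session.
    Assumes a complete table (no missing sessions) and non-zero variance.
    """
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of sessions
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Partition the total sum of squares into subject, session, and residual parts.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)                  # between-subjects mean square
    ems = ss_error / ((n - 1) * (k - 1))     # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems)


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's unweighted kappa for two equal-length lists of category labels."""
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    # Chance agreement from the marginal proportions of each category.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)
```

Because the consistency form ignores a uniform shift between sessions, a group that improves by the same amount every year can still show a high ICC; systematic year-to-year change would have to be examined separately.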


ICC values ranged from 0.28 to 0.67 for the 1-year test interval and from 0.15 to 0.57 for the 2-year interval. Kappa values ranged from 0.16 to 0.21 for the 1-year interval and from 0.29 to 0.31 for the 2-year interval. Across all measures, the observed effect sizes were small, ranging from 0.01 to 0.44.


This investigation noted less-than-optimal reliability for the most common concussion baseline assessments. While none of the assessments met or exceeded the accepted clinical threshold, the effect sizes were relatively small, suggesting an overlap in performance from year to year. As such, baseline assessments beyond the initial evaluation in cadets are not essential but could aid concussion diagnosis.


Corresponding author

*Correspondence and reprint requests to: Megan N. Houston, John A. Feagin Jr. Sports Medicine Fellowship, Keller Army Hospital, 900 Washington Road, West Point, NY 10996, USA. Tel: +1 845 938 6826. E-mail:



