STUDY QUALITY IN SLA: An Assessment of Designs, Analyses, and Reporting Practices in Quantitative L2 Research

  • Luke Plonsky


This study assesses research and reporting practices in quantitative second language (L2) research. A sample of 606 primary studies, published from 1990 to 2010 in Language Learning and Studies in Second Language Acquisition, was collected and coded for designs, statistical analyses, reporting practices, and outcomes (i.e., effect sizes). The results point to several systematic strengths as well as many flaws, such as a lack of control in experimental designs, incomplete and inconsistent reporting practices, and low statistical power. I discuss these trends, strengths, and weaknesses in comparison with methodological reviews of L2 research (e.g., Plonsky & Gass, 2011) as well as reviews from other fields (e.g., education, Skidmore & Thompson, 2010). On the basis of the findings, I also make a number of suggestions for methodological reforms in applied linguistics.


Corresponding author

*Correspondence concerning this article should be addressed to Luke Plonsky, PO Box 6032, Flagstaff, AZ 86011.


Aguinis, H., Pierce, C. A., Bosco, F. A., & Muslin, I. S. (2009). First decade of Organizational Research Methods: Trends in design, measurement, and data-analysis topics. Organizational Research Methods, 12, 69–112.
American Psychological Association. (2010). Publication manual of the American Psychological Association (6th ed.). Washington, DC: Author.
Bangert, A. W., & Baumberger, J. P. (2005). Research and statistical techniques used in the Journal of Counseling & Development. Journal of Counseling & Development, 83, 480–487.
Brutus, S., Gill, H., & Duniewicz, K. (2010). State-of-science in industrial and organizational psychology: A review of self-reported limitations. Personnel Psychology, 63, 907–936.
Campbell, D., & Stanley, J. (1963). Experimental and quasi-experimental designs for research. Chicago: Rand-McNally.
Cashen, L. H., & Geiger, S. W. (2004). Statistical power and the testing of null hypotheses: A review of contemporary management research and recommendations for future studies. Organizational Research Methods, 7, 151–167.
Chan, A.-W., Hróbjartsson, A., Haahr, M. T., Gøtzsche, P. C., & Altman, D. G. (2004). Empirical evidence for selective reporting of outcomes in randomized trials. Journal of the American Medical Association, 291, 2457–2465.
Chaudron, C. (1986). The interaction of quantitative and qualitative approaches to research: A view of the second language classroom. TESOL Quarterly, 20, 709–717.
Chaudron, C. (2001). Progress in language classroom research: Evidence from The Modern Language Journal, 1916–2000. Modern Language Journal, 85, 57–76.
Cohen, J. (1968). Multiple regression as a general data-analytic system. Psychological Bulletin, 70, 426–443.
Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997–1003.
Crookes, G. (1991). Power, effect size, and second language research: Another researcher comments. TESOL Quarterly, 25, 762–765.
DeKeyser, R., & Schoonen, R. (2007). Editors’ announcement. Language Learning, 57, ix–x.
DeVaney, T. A. (2001). Statistical significance, effect size, and replication: What do the journals say? The Journal of Experimental Education, 69, 310–320.
Dinsmore, T. H. (2006). Principles, parameters, and SLA: A retrospective meta-analytic investigation into adult L2 learners’ access to Universal Grammar. In Norris, J. M. & Ortega, L. (Eds.), Synthesizing research on language learning and teaching (pp. 53–90). Amsterdam: Benjamins.
Downs, S. H., & Black, N. (1998). The feasibility of creating a checklist for the assessment of the methodological quality both of randomized and nonrandomized studies of health care interventions. Journal of Epidemiology & Community Health, 52, 377–384.
Egbert, J. (2007). Quality analysis of journals in TESOL and applied linguistics. TESOL Quarterly, 41, 157–171.
Ellis, N. C. (2000). Editorial statement. Language Learning, 50, xi–xiii.
Fish, L. J. (1988). Why multivariate methods are usually vital. Measurement and Evaluation in Counseling and Development, 21, 130–137.
Flahive, D., & Ehlers-Zavala, F. (2010, March). Power analysis in applied linguistics research. Paper presented at the meeting of the American Association for Applied Linguistics, Atlanta, GA.
Fleiss, J. L. (1971). Measuring nominal scale agreement among many raters. Psychological Bulletin, 76, 378–382.
Gass, S. M. (1993). Second language acquisition: Cross-disciplinary perspectives. Second Language Research, 9, 95–98.
Gass, S. (2009). A survey of SLA research. In Ritchie, W. & Bhatia, T. (Eds.), Handbook of second language acquisition (pp. 3–28). Bingley, UK: Emerald.
Gass, S., Fleck, C., Leder, N., & Svetics, I. (1998). Ahistoricity revisited: Does SLA have a history? Studies in Second Language Acquisition, 20, 407–421.
Gelman, A., Hill, J., & Yajima, M. (2012). Why we (usually) don’t have to worry about multiple comparisons. Journal of Research on Educational Effectiveness, 5, 189–211.
Gelman, A., & Weakliem, D. (2009). Of beauty, sex and power: Too little attention has been paid to the statistical challenges in estimating small effects. American Scientist, 97, 310–316.
Goodwin, L. D., & Goodwin, W. L. (1985). An analysis of statistical techniques used in the Journal of Educational Psychology, 1979–1983. Educational Psychologist, 20, 13–21.
Hatch, E. (1978). Apply with caution. Studies in Second Language Acquisition, 2, 123–143.
Hatch, E., & Lazaraton, A. (1991). The research manual: Design and statistics for applied linguistics. Boston: Heinle & Heinle.
Hauser, E. (2001, October). The statistical power of second language acquisition research: A review. Paper presented at the Pacific Second Language Research Forum, University of Hawai‘i at Mānoa.
Henning, G. (1986). Quantitative methods in language acquisition research. TESOL Quarterly, 20, 701–708.
Humphreys, L. G. (1978). Doing research the hard way: Substituting analysis of variance for a problem in correlational analysis. Journal of Educational Psychology, 70, 873–876.
Journal Article Reporting Standards Working Group. (2008). Reporting standards for research in psychology: Why do we need them? What might they be? American Psychologist, 63, 839–851.
Keselman, H. J., Huberty, C. J., Lix, L. M., Olejnik, S., Cribbie, R. A., Donahue, B., . . . Levin, J. R. (1998). Statistical practices of educational researchers: An analysis of their ANOVA, MANOVA, and ANCOVA analyses. Review of Educational Research, 68, 350–386.
Kieffer, K. M., Reese, R. J., & Thompson, B. (2001). Statistical techniques employed in AERJ and JCP articles from 1988 to 1997: A methodological review. The Journal of Experimental Education, 69, 280–309.
Kubanyiova, M. (2008). Rethinking research ethics in contemporary applied linguistics: The tension between macroethical and microethical perspectives in situated research. Modern Language Journal, 92, 503–518.
Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33, 159–174.
Larson-Hall, J. (2010). A guide to doing statistics in second language research using SPSS. London: Routledge.
Larson-Hall, J., & Herrington, R. (2010). Improving data analysis in second language acquisition by utilizing modern developments in applied statistics. Applied Linguistics, 31, 368–390.
Lazaraton, A. (1991). Power, effect size, and second language research: A researcher comments. TESOL Quarterly, 25, 759–762.
Lazaraton, A. (2000). Current trends in research methodology and statistics in applied linguistics. TESOL Quarterly, 34, 175–181.
Lazaraton, A. (2005). Quantitative research methods. In Hinkel, E. (Ed.), Handbook of research in second language teaching and learning (pp. 109–224). Mahwah, NJ: Erlbaum.
Lazaraton, A., Riggenbach, H., & Ediger, A. (1987). Forming a discipline: Applied linguists’ literacy in research methodology and statistics. TESOL Quarterly, 21, 263–277.
Lee, J. (2010). Integrating second language empirical evidence in theory construction: Unaccusativity as a dichotomy versus a continuum. Eoneohag, 56, 67–86.
Li, S. (2010). The effectiveness of corrective feedback in SLA: A meta-analysis. Language Learning, 60, 309–365.
Lightbown, P. M. (2000). Anniversary article: Classroom second language research and second language teaching. Applied Linguistics, 21, 431–462.
Loewen, S. (2005). Incidental focus on form and second language learning. Studies in Second Language Acquisition, 27, 361–386.
Loewen, S., & Gass, S. (2009). The use of statistics in L2 acquisition research. Language Teaching, 42, 181–196.
Loewen, S., Lavolette, E., Spino, L., Papi, M., Schmidtke, J., Sterling, S., & Wolff, D. (in press). A discipline formed?: An update on applied linguists’ statistical literacy. TESOL Quarterly.
Lykken, D. E. (1968). Statistical significance in psychological research. Psychological Bulletin, 70, 151–159.
Lyster, R., & Izquierdo, J. (2009). Prompts versus recasts in dyadic interaction. Language Learning, 59, 453–498.
Mackey, A., & Gass, S. M. (2005). Second language research: Methodology and design. Mahwah, NJ: Erlbaum.
Mackey, A., & Gass, S. M. (Eds.). (2012). Research methods in second language acquisition: A practical guide. Oxford: Wiley-Blackwell.
Mackey, A., & Goo, J. (2007). Interaction research in SLA: A meta-analysis and research synthesis. In Mackey, A. (Ed.), Conversational interaction in second language acquisition: A collection of empirical studies (pp. 407–449). Oxford: Oxford University Press.
Magnan, S. S. (1994). From the editor: The MLJ tradition and the challenges ahead. Modern Language Journal, 78, 7–9.
Magnan, S. S. (2007). Commentary: The promise of digital scholarship in SLA research and language pedagogy. Language Learning & Technology, 11, 152–155.
Matrixx Initiatives, Inc. v. Siracusano, No. 09–1156 (9th Cir. Mar. 22, 2011).
Matthews, M. S., Gentry, M., McCoach, D. B., Worrell, F. C., Matthews, D., & Dixon, F. (2008). Evaluating the state of a field: Effect size reporting in gifted education. The Journal of Experimental Education, 77, 55–65.
Meier, S. T., & Davis, S. R. (1990). Trends in reporting psychometric properties of scales used in counseling psychology research. Journal of Counseling Psychology, 37, 113–115.
Mone, M. A., Mueller, G. C., & Mauland, W. (1996). The perceptions and usage of statistical power in applied psychology and management research. Personnel Psychology, 49, 103–120.
Nassaji, H. (2012). Significance tests and generalizability of research results: A case for replication. In Porte, G. (Ed.), Replication research in applied linguistics (pp. 92–115). New York: Cambridge University Press.
Nekrasova, T., & Becker, T. (2009). Effectiveness of practice: A research synthesis and quantitative meta-analysis. Manuscript in preparation.
Nicoladis, E., & Krott, A. (2007). Word family size and French-speaking children’s segmentation of existing compounds. Language Learning, 57, 201–228.
Norris, J. M., & Ortega, L. (2000). Effectiveness of L2 instruction: A research synthesis and quantitative meta-analysis. Language Learning, 50, 417–528.
Norris, J. M., & Ortega, L. (2003). Defining and measuring SLA. In Doughty, C. J. & Long, M. H. (Eds.), The handbook of second language acquisition (pp. 717–761). Oxford: Blackwell.
Norris, J. M., & Ortega, L. (2006). The value and practice of research synthesis for language learning and teaching. In Norris, J. M. & Ortega, L. (Eds.), Synthesizing research on language learning and teaching (pp. 3–50). Amsterdam: Benjamins.
Norris, J. M., & Ortega, L. (2012). Assessing learner knowledge. In Gass, S. M. & Mackey, A. (Eds.), The Routledge handbook of second language acquisition (pp. 573–589). London: Routledge.
Nunan, D. (1991). Methods in second language classroom-oriented research: A critical review. Studies in Second Language Acquisition, 13, 249–274.
Nunan, D. (1996). Issues in second language acquisition research: Examining substance and procedure. In Ritchie, W. C. & Bhatia, T. K. (Eds.), The handbook of second language acquisition (pp. 349–374). San Diego, CA: Academic Press.
Ortega, L. (2005). Methodology, epistemology, and ethics in instructed SLA research: An introduction. Modern Language Journal, 89, 317–327.
Ortega, L. (2009). Understanding second language acquisition. London: Hodder.
Ortega, L. (2012). Language acquisition research for language teaching: Choosing between application and relevance. In Hinger, B., Newby, D., & Unterrainer, E. M. (Eds.), Sprachen lernen: Kompetenzen entwickeln? Performanzen (über)prüfen [Language learning: Developing competency? (Re)assessing performances] (pp. 24–38). Vienna: Präsens Verlag.
Oswald, F. L., & Plonsky, L. (2010). Meta-analysis in second language research: Choices and challenges. Annual Review of Applied Linguistics, 30, 85–110.
Pica, T. (1997). Second language teaching and research relationships: A North American view. Language Teaching Research, 1, 48–72.
Pigott, T. D. (2009). Handling missing data. In Cooper, H., Hedges, L. V., & Valentine, J. C. (Eds.), The handbook of research synthesis (2nd ed., pp. 399–416). New York: Russell Sage Foundation.
Plonsky, L. (2009, October). “Nix the null”: Why statistical significance is overrated. Paper presented at the Second Language Research Forum, East Lansing, MI.
Plonsky, L. (2011a). The effectiveness of second language strategy instruction: A meta-analysis. Language Learning, 61, 993–1038.
Plonsky, L. (2011b). Study quality in SLA: A cumulative and developmental assessment of designs, analyses, reporting practices, and outcomes in quantitative L2 research (Unpublished doctoral dissertation). Michigan State University, East Lansing.
Plonsky, L. (2012). Replication, meta-analysis, and generalizability. In Porte, G. (Ed.), Replication research in applied linguistics (pp. 116–132). New York: Cambridge University Press.
Plonsky, L., & Gass, S. (2011). Quantitative research methods, study quality, and outcomes: The case of interaction research. Language Learning, 61, 325–366.
Plonsky, L., & Oswald, F. L. (2012). How to do a meta-analysis. In Mackey, A. & Gass, S. (Eds.), Research methods in second language acquisition: A practical guide (pp. 275–295). Oxford: Wiley-Blackwell.
Polio, C. (1997). Measures of linguistic accuracy in second language writing research. Language Learning, 47, 101–143.
Polio, C. (2012). Replication in published applied linguistics research: An historical perspective. In Porte, G. (Ed.), Replication research in applied linguistics (pp. 47–91). New York: Cambridge University Press.
Porte, G. (2010). Appraising research in second language learning: A practical approach to critical analysis of quantitative research (2nd ed.). Amsterdam: Benjamins.
Pulido, D. (2004). The relationship between text comprehension and second language incidental vocabulary acquisition: A matter of topic familiarity? Language Learning, 54, 469–523.
Raykov, T., & Marcoulides, G. A. (2008). An introduction to applied multivariate analysis. New York: Taylor & Francis.
Read, J. (2007). Towards a new collaboration: Research in SLA and language testing. New Zealand Studies in Applied Linguistics, 13, 22–35.
Russell, J., & Spada, N. (2006). The effectiveness of corrective feedback for the acquisition of L2 grammar: A meta-analysis of the research. In Norris, J. M. & Ortega, L. (Eds.), Synthesizing research on language learning and teaching (pp. 133–164). Amsterdam: Benjamins.
Schmidt, F. L. (1996). Statistical significance testing and cumulative knowledge in psychology: Implications for training researchers. Psychological Methods, 1, 115–129.
Sedlmeier, P., & Gigerenzer, G. (1989). Do studies of statistical power have an effect on the power of studies? Psychological Bulletin, 105, 309–316.
Selinker, L., & Lakshmanan, U. (2001). How do we know what we know? Why do we believe what we believe? Second Language Research, 17, 323–325.
Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Boston: Houghton Mifflin.
Skidmore, S. T., & Thompson, B. (2010). Statistical techniques used in published articles: A historical review of reviews. Educational and Psychological Measurement, 70, 777–795.
Smith, B., & Lafford, B. A. (2009). The evaluation of scholarly activity in computer-assisted language learning. Modern Language Journal, 93, 868–883.
Sun, S., Pan, W., & Wang, L. L. (2010). A comprehensive review of effect size reporting and interpreting practices in academic journals in education and psychology. Journal of Educational Psychology, 102, 989–1004.
Teleni, V., & Baldauf, R. B. (1989). Statistical techniques used in three applied linguistics journals: Language Learning, Applied Linguistics, and TESOL Quarterly, 1980–1986: Implications for readers and researchers. Retrieved from ERIC database. (ED312905).
Thompson, B. (2001). Significance, effect sizes, stepwise methods, and other issues: Strong arguments move the field. The Journal of Experimental Education, 70, 80–93.
Thompson, B., & Snyder, P. A. (1998). Statistical significance and reliability analyses in recent JCD research articles. Journal of Counseling and Development, 76, 436–441.
Vacha-Haase, T., Ness, C., Nilsson, J., & Reetz, D. (1999). Practices regarding reporting of reliability coefficients: A review of three journals. The Journal of Experimental Education, 67, 335–341.
Vacha-Haase, T., & Thompson, B. (2004). How to estimate and interpret various effect sizes. Journal of Counseling Psychology, 51, 473–481.
Valdman, A. (1998). A note from the editor: 20th anniversary of SSLA. Studies in Second Language Acquisition, 20, 463–470.
Valentine, J. C., & Cooper, H. (2008). A systematic and transparent approach for assessing the methodological quality of intervention effectiveness research: The study design and implementation assessment device (Study DIAD). Psychological Methods, 13, 130–149.
VanPatten, B., & Williams, J. (2002). Research criteria for tenure in second language acquisition: Results from a survey of the field. Unpublished manuscript, University of Illinois at Chicago.
Wa-Mbaleka, S. (2006). A meta-analysis investigating the effects of reading on second language vocabulary learning (Unpublished doctoral dissertation). Northern Arizona University, Flagstaff.
Waring, H. Z. (2009). Moving out of IRF (initiation-response-feedback): A single case analysis. Language Learning, 59, 796–824.
Wells, C. S., & Hintze, J. M. (2007). Dealing with assumptions underlying statistical tests. Psychology in the Schools, 44, 495–502.
Wells, K., & Littell, J. H. (2009). Study quality assessment in systematic reviews of research on intervention effects. Research on Social Work Practice, 19, 52–62.
Wilkinson, L., & Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594–604.
Willson, V. L. (1980). Research techniques in AERJ articles: 1969 to 1978. Educational Researcher, 9, 5–10.