  • Print publication year: 2011
  • Online publication date: June 2012

1 - The Learning Sciences in Educational Assessment: An Introduction


Victor Hugo is credited with stating that “There is nothing more powerful than an idea whose time has come.” In educational achievement testing, a multi-billion-dollar activity with profound implications for individuals, governments, and countries, the idea whose time has come, it seems, is that large-scale achievement tests must be designed according to the science of human learning. Why this idea, and why now? To begin to set a context for this idea and this question, a litany of research studies and public policy reports can be cited to make the simple point that students in the United States and abroad are performing relatively poorly in relation to expected standards and projected economic growth requirements (e.g., American Association for the Advancement of Science, 1993; Chen, Gorin, Thompson, & Tatsuoka, 2008; Grigg, Lauko, & Brockway, 2006; Hanushek, 2003, 2009; Kilpatrick, Swafford, & Findell, 2001; Kirsch, Braun, & Yamamoto, 2007; Manski & Wise, 1983; Murnane, Willett, Duhaldeborde, & Tyler, 2000; National Commission on Excellence in Education, 1983; National Mathematics Advisory Panel, 2008; National Research Council, 2005, 2007, 2009; Newcombe et al., 2009; Phillips, 2007; Provasnik, Gonzales, & Miller, 2009). In a 2007 article in the New York Times, Gary Phillips, chief scientist at the American Institutes for Research, was quoted as saying, “our Asian economic competitors are winning the race to prepare students in math and science.”

Ainsworth, S. & Loizou, A. (2003). The effects of self-explaining when learning with text or diagrams. Cognitive Science, 27, 669–681.
American Association for the Advancement of Science. (1993). Benchmarks for science literacy. New York: Oxford University Press.
Anderson, L.W., Krathwohl, D.R., Airasian, P.W., Cruikshank, K.A., Mayer, R.E., Pintrich, P.R., Raths, J., & Wittrock, C. (Eds.). (2001). A taxonomy for learning, teaching, and assessing – a revision of Bloom's Taxonomy of Educational Objectives. New York: Addison Wesley Longman.
Anderson, J.R. (2007). How can the human mind occur in the physical universe? New York: Oxford University Press.
Baron, J. (2000). Thinking and deciding (3rd ed.). New York: Cambridge University Press.
Baxter, G. & Glaser, R. (1998). Investigating the cognitive complexity of science assessments. Educational Measurement: Issues and Practice, 17, 37–45.
Bennett, R.E. & Gitomer, D.H. (2008). Transforming K-12 assessment: Integrating accountability testing, formative assessment, and professional support. ETS Research Memorandum-08–13, 1–30. Princeton, NJ: Educational Testing Service.
Birenbaum, M., Tatsuoka, C., & Yamada, T. (2004). Diagnostic assessment in TIMSS-R: Between countries and within-country comparisons of eighth graders' mathematics performance. Studies in Educational Evaluation, 30, 151–173.
Bishop, J. (1989). Is the test score decline responsible for the productivity growth decline? American Economic Review, 79, 178–97.
Bishop, J.H. (1991). Achievement, test scores, and relative wages. In Kosters, M. H. (Ed.), Workers and their wages (pp. 146–186). Washington, DC: The AEI Press.
Bloom, B., Englehart, M., Furst, E., Hill, W., & Krathwohl, D. (1956). Taxonomy of educational objectives: The classification of educational goals. Handbook I: Cognitive domain. New York: Longmans, Green.
Borsboom, D. (2005). Measuring the mind: Conceptual issues in contemporary psychometrics. New York: Cambridge University Press.
Bransford, J.D., Brown, A.L., & Cocking, R.R. (2000). How people learn: Brain, mind, experience, and school: Expanded edition. Washington, DC: National Academy Press.
Briggs, D.C., Alonzo, A. C., Schwab, C., & Wilson, M. (2006). Diagnostic assessment with ordered multiple-choice items. Educational Assessment, 11(1), 33–63.
Carver, S.M. (2006). Assessing for deep understanding. In Sawyer, R.K. (Ed.), The Cambridge handbook of the learning sciences (pp. 205–221). New York: Cambridge University Press.
Chen, Y-H., Gorin, J.S., Thompson, M.S. & Tatsuoka, K.K. (2008). Cross-cultural validity of the TIMSS-1999 mathematics test: Verification of a cognitive model. International Journal of Testing, 8, 251–271.
Chi, M.T.H. (1997). Quantifying qualitative analyses of verbal data: A practical guide. The Journal of the Learning Sciences, 6, 271–315.
Chi, M., Feltovich, P., & Glaser, R. (1981). Categorization and representation of physics problems by experts and novices. Cognitive Science, 5, 121–152.
Chi, M.T.H., Glaser, R., & Farr, M. (Eds.). (1988). The nature of expertise. Hillsdale, NJ: Erlbaum.
Chomsky, N. (1959). Review of Skinner's verbal behavior. Language, 35, 26–58.
Collins, A., Halverson, R., & Brown, J.S. (2009). Rethinking Education in the Age of Technology. New York, NY: Teachers College Press.
Corter, J.E. & Tatsuoka, K.K. (2002). Cognitive and measurement foundations of diagnostic assessments in mathematics. College Board Technical Report. New York: Teachers College, Columbia University.
Cronbach, L.J. (1957). The two disciplines of scientific psychology. American Psychologist, 12, 671–684.
de Lange, J. (2007). Large-scale assessment of mathematics education. In Lester, F.K. Jr. (Ed.), National Council of Teachers of Mathematics: Second handbook of research on mathematics teaching and learning (pp. 1111–1142). Charlotte, NC: Information Age Publishing.
Downing, S.M. & Haladyna, T.M. (Eds.). (2006). Handbook of test development. Mahwah, NJ: Erlbaum.
Embretson, S.E. (1998). A cognitive design system approach to generating valid tests: Application to abstract reasoning. Psychological Methods, 3, 380–396.
Embretson, S.E. & Daniel, R.S. (2008). Understanding and quantifying cognitive complexity level in mathematical problem solving items. Psychology Science, 50, 328–344.
Embretson, S.E. & Wetzel, C.D. (1987). Component latent trait models for paragraph comprehension. Applied Psychological Measurement, 11, 175–193.
Ericsson, K.A. (Ed.). (2009). Development of professional expertise: Toward measurement of expert performance and design of optimal learning environments. New York: Cambridge University Press.
Ericsson, K.A., Charness, N., Feltovich, P.J., & Hoffman, R.R. (Eds.). (2006). The Cambridge handbook of expertise and expert performance. Cambridge, UK: Cambridge University Press.
Ericsson, K.A. & Simon, H.A. (1993). Protocol analysis: Verbal reports as data. Cambridge, MA: The MIT Press.
Ericsson, K.A. & Smith, J. (Eds.). (1991). Toward a general theory of expertise: Prospects and limits. New York: Cambridge University Press.
Ferrara, S. & DeMauro, G.E. (2006). Standardized assessment of individual achievement in K-12. In Brennan, R. L. (Ed.), Educational measurement (4th ed., pp. 579–621). Westport, CT: National Council on Measurement in Education and American Council on Education.
Gierl, M.J. (1997). Comparing the cognitive representations of test developers and students on a mathematics achievement test using Bloom's taxonomy. Journal of Educational Research, 91, 26–32.
Gierl, M.J., Leighton, J.P., Wang, C., Zhou, J., Gokiert, R., & Tan, A. (2009). Validating cognitive models of task performance in algebra on the SAT®. College Board Research Report No. 2009–3. New York: The College Board.
Gierl, M.J., Zheng, Y., & Cui, Y. (2008). Using the attribute hierarchy method to identify and interpret cognitive skills that produce group differences. Journal of Educational Measurement, 45, 65–89.
Gobbo, C. & Chi, M.T.H. (1986). How knowledge is structured and used by expert and novice children. Cognitive Development, 1, 221–237.
Gonzalez, E.J. & Miles, J.A. (2001). TIMSS 1999 user guide for the international database: IEA's repeat of the third international mathematics and science study at the eighth grade. Chestnut Hill, MA: TIMSS International Study Center, Boston College.
Gorin, J.S. (2009). Diagnostic classification models: Are they necessary? Measurement: Interdisciplinary Research and Perspectives, 7(1), 30–33.
Gorin, J.S. (2006). Test Design with cognition in mind. Educational Measurement: Issues and Practice, 25(4), 21–35.
Gorin, J.S. & Embretson, S.E. (2006). Item difficulty modeling of paragraph comprehension items. Applied Psychological Measurement, 30, 394–411.
Greeno, J.G. (1983). Forms of understanding in mathematical problem solving. In Paris, S.G., Olson, G.M., Stevenson, H.W. (Eds.), Learning and motivation in the classroom (pp. 83–111). Hillsdale, NJ: Erlbaum.
Greeno, J.G. (2003). Measurement, trust, and meaning. Measurement: Interdisciplinary Research and Perspectives, 1, 260–263.
Grigg, W., Lauko, M., & Brockway, D. (2006). The nation's report card: Science 2005 (NCES 2006–466). U.S. Department of Education, National Center for Education Statistics. Washington, DC: U.S. Government Printing Office.
Hanushek, E.A. (2003). The failure of input-based schooling policies. The Economic Journal, 113, 64–98.
Hanushek, E.A. (2005). The economics of school quality. German Economic Review, 6, 269–286.
Hanushek, E.A. (2009). The economic value of education and cognitive skills. In Sykes, G., Schneider, B. & Plank, D. N. (Eds.), Handbook of education policy research (pp. 39–56). New York: Routledge.
Hussar, W.J. & Bailey, T.M. (2008). Projections of Education Statistics to 2017 (NCES 2008–078). National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education. Washington, DC.
Irvine, S.H. & Kyllonen, P.C. (2002). Item generation for test development. Mahwah, NJ: Lawrence Erlbaum.
Johnson-Laird, P.N. (1983). Mental models. Towards a cognitive science of language, inference, and consciousness. Cambridge, MA: Harvard University Press.
Johnson-Laird, P.N. & Bara, B.G. (1984). Syllogistic inference. Cognition, 16, 1–61.
Johnson-Laird, P.N. (2004). Mental models and reasoning. In Leighton, J.P. & Sternberg, R.J. (Eds.), Nature of reasoning (pp. 169–204). Cambridge University Press.
Kane, M. (2006). Validation. In Brennan, R. L. (Ed.), Educational measurement (4th ed., pp. 17–64). Westport, CT: National Council on Measurement in Education and American Council on Education.
Kilpatrick, J., Swafford, J. & Findell, B. (Eds.). (2001). Adding It Up: Helping Children Learn Mathematics. Washington, DC: National Academies Press.
Kintsch, W. (1998). Comprehension: A paradigm for cognition. Cambridge, UK: Cambridge University Press.
Kirsch, I., Braun, H., & Yamamoto, K. (2007). America's perfect storm: Three forces changing our nation's future. ETS Policy Information Report. Educational Testing Service.
Klahr, D. & Dunbar, K. (1988). Dual search space during scientific reasoning. Cognitive Science, 12, 1–48.
Koedinger, K.R. & Anderson, J.R. (1990). Abstract planning and perceptual chunks: Elements of expertise in geometry. Cognitive Science, 14, 511–550.
Koretz, D.M. & Hamilton, L.S. (2006). Testing for accountability in K-12. In Brennan, R. (Ed.), Educational Measurement (4th ed., pp. 531–578). Washington, DC: American Council on Education.
Kuhn, D. (2001). How do people know? Psychological Science, 12, 1–8.
Kuhn, D. (2005). Education for thinking. Cambridge, MA: Harvard University Press.
Larkin, J. & Simon, H. (1987). Why a diagram is (sometimes) worth 10,000 words. Cognitive Science, 11, 65–99.
Leighton, J.P. (in press). A Cognitive Model for the Assessment of Higher Order Thinking in Students. To appear in Schraw, G. (Ed.), Current perspectives on cognition, learning, and instruction: Assessment of higher order thinking skills. Information Age Publishing.
Leighton, J.P., Cui, Y., & Cor, M.K. (2009). Testing expert-based and student-based cognitive models: An application of the attribute hierarchy method and hierarchical consistency index. Applied Measurement in Education, 22, 1–26.
Leighton, J.P., Gierl, M.J., & Hunka, S. (2004). The attribute hierarchy method for cognitive assessment: A Variation on Tatsuoka's rule-space approach. Journal of Educational Measurement, 41, 205–236.
Leighton, J.P. & Gierl, M.J. (2007). Defining and evaluating models of cognition used in educational measurement to make inferences about examinees' thinking processes. Educational Measurement: Issues and Practice, 26, 3–16.
Leighton, J.P., Heffernan, C., Cor, M.K., Gokiert, R., & Cui, Y. (in press). An experimental test of student verbal reports and teacher evaluations as a source of validity evidence for test development. Applied Measurement in Education.
Leighton, J.P. & Sternberg, R.J. (2003). Reasoning and problem solving. In Healy, A.F. & Proctor, R.W. (Volume Eds.), Experimental psychology (pp. 623–648). Volume 4 in I. B. Weiner (Editor-in-Chief) Handbook of psychology. New York: Wiley.
Lohman, D.F. & Nichols, P. (2006). Meeting the NRC panel's recommendations: Commentary on the papers by Mislevy and Haertel, Gorin, and Abedi and Gandara. Educational Measurement: Issues and Practice, 25, 58–64.
Manski, C.F. & Wise, D.A. (1983). College choice in America. Cambridge, MA: Harvard University Press.
Mislevy, R.J. (1993). Foundations of a new test theory. In Frederiksen, N., Mislevy, R.J., & Bejar, I.I. (Eds.), Test theory for a new generation of tests (pp. 19–39). Hillsdale, NJ: Lawrence Erlbaum.
Mislevy, R.J. (2006). Cognitive psychology and educational assessment. In Brennan, R. L. (Ed.), Educational measurement (4th ed., pp. 257–305). Westport, CT: National Council on Measurement in Education and American Council on Education.
Mislevy, R.J., Behrens, J.T., Bennett, R.E., Demark, S.F., Frezzo, D.C., Levy, R., Robinson, D.H., Rutstein, D.W., Shute, V.J., Stanley, K., & Winters, F.I. (2010). On the roles of external knowledge representations in assessment design. Journal of Technology, Learning, and Assessment, 8(2).
Mislevy, R.J. & Haertel, G. (2006). Implications for evidence-centered design for educational assessment. Educational Measurement: Issues and Practice, 25, 6–20.
Mislevy, R.J., Steinberg, L.S., & Almond, R.G. (2003). On the structure of educational assessments. Measurement: Interdisciplinary Research and Perspectives, 1, 3–67.
Murnane, R.J., Willett, J.B., Duhaldeborde, Y. & Tyler, J.H. (2000). How important are the cognitive skills of teenagers in predicting subsequent earnings? Journal of Policy Analysis and Management, 19, 547–68.
National Commission on Excellence in Education. (1983). A nation at risk: The imperative for educational reform. Washington, DC: U.S. Government Printing Office.
National Mathematics Advisory Panel. (2008). Foundations for Success: The Final Report of the National Mathematics Advisory Panel. U.S. Department of Education: Washington, DC.
National Research Council (2001). Knowing what students know: The science and design of educational assessment. Committee on the Foundations of Assessment. Pellegrino, J., Chudowsky, N., and Glaser, R. (Eds.). Board on Testing and Assessment, Center for Education. Washington, DC: National Academy Press.
National Research Council (2005). How Students Learn: History, Mathematics, and Science in the Classroom. In Donovan, M.S. & Bransford, J.D. (Eds.), Division of Behavioral and Social Sciences and Education. Washington, DC: The National Academies Press.
National Research Council (2007). Taking science to school: Learning and teaching science in grades K-8. Arlington, VA: National Science Foundation.
National Research Council (2009). Mathematics Learning in Early Childhood: Paths Toward Excellence and Equity. Committee on Early Childhood Mathematics, Cross, Christopher T., Woods, Taniesha A., and Schweingruber, Heidi, Editors. Center for Education, Division of Behavioral and Social Sciences and Education. Washington, DC: The National Academies Press.
Newcombe, N.S., Ambady, N., Eccles, J., Gomez, L., Klahr, D., Linn, M., Miller, K., & Mix, K. (2009). Psychology's role in mathematics and science education. American Psychologist, 64, 538–550.
Newell, A. & Simon, H. A. (1972). Human problem solving. NJ: Prentice Hall.
New York Times (Nov. 14, 2007). Study compares states' math and science scores with other countries'.
No Child Left Behind Act of 2002, Pub Law No. 107–110 (2002, January). Retrieved April 11, 2009 from 107–110.pdf
Organization for Economic Cooperation and Development (2003). Program for International Student Assessment (PISA). Author.
Organization for Economic Cooperation and Development (2007). PISA 2006: Science Competencies for Tomorrow's World Executive Summary. Author. Accessed from world wide web on March 26, 2010, at
Organization for Economic Cooperation and Development (2009). Education today: The OECD perspective. Author.
Pashler, H., Rohrer, D., Cepeda, N. & Carpenter, S. (2007). Enhancing learning and retarding forgetting: Choices and consequences. Psychonomic Bulletin & Review, 14, 187–193.
Pellegrino, J.W., Baxter, G.P., & Glaser, R. (1999). Addressing the “Two Disciplines” problem: Linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education, 24, 307–352.
Piaget, J. & Inhelder, B. (1967). The child's conception of space. New York: W. W. Norton & Co.
Phillips, G.W. (2007). Expressing international educational achievement in terms of U.S. performance standards: Linking NAEP achievement levels to TIMSS. Washington, DC: American Institutes for Research.
Polikoff, M. (2010). Instructional sensitivity as a psychometric property of assessments. Educational Measurement: Issues and Practice, 29, 3–14.
Provasnik, S., Gonzales, P., & Miller, D. (2009). U.S. Performance Across International Assessments of Student Achievement: Special Supplement to The Condition of Education 2009 (NCES 2009–083). National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education. Washington, DC.
Resnick, L.B. (1983). Toward a cognitive theory of instruction. In Paris, S.G., Olson, G.M., Stevenson, H.W. (Eds.), Learning and motivation in the classroom (pp. 5–38). Hillsdale, NJ: Erlbaum.
Russell, J.F. (2005). Evidence related to awareness, adoption, and implementation of the standards for technological literacy: Content for the study of technology. The Journal of Technology Studies, 31, 30–38.
Rupp, A.A. (2007). The answer is in the question: A guide for describing and investigating the conceptual foundations and statistical properties of cognitive psychometric models. International Journal of Testing, 7, 95–125.
Rupp, A.A. & Templin, J.L. (2008). Unique characteristics of diagnostic classification models: A comprehensive review of the current state-of-the-art. Measurement, 2, 219–262.
Sawyer, R.K. (Ed.). (2006). The Cambridge handbook of the learning sciences. New York: Cambridge University Press.
Shute, V.J., Hansen, E.G., & Almond, R.G. (2008). You can't fatten a hog by weighing it – or can you? Evaluating an assessment for learning system called ACED. International Journal of Artificial Intelligence in Education, 18, 289–316.
Shute, V.J., Masduki, I., Donmez, O., Kim, Y.J., Dennen, V.P., Jeong, A.C., & Wang, C-Y. (2010). Modeling, assessing, and supporting key competencies within game environments. In Ifenthaler, D., Pirnay-Dummer, P., & Seel, N.M. (Eds.), Computer-based diagnostics and systematic analysis of knowledge (pp. 281–310). New York, NY: Springer.
Siegler, R.S. (2005). Children's learning. American Psychologist, 60, 769–778.
Schmeiser, C.B. & Welch, C.J. (2006). Test development. In Brennan, R. L. (Ed.), Educational measurement (4th ed., pp. 307–353). Westport, CT: National Council on Measurement in Education and American Council on Education.
Shrager, J. & Siegler, R.S. (1998). SCADS: A model of children's strategy choices and strategy discoveries. Psychological Science, 9, 405–410.
Slotta, J.D. & Chi, M.T.H. (2006). Helping students understand challenging topics in science through ontology training. Cognition and Instruction, 24, 261–289.
Snow, R.E. & Lohman, D.F. (1989). Implications of cognitive psychology for educational measurement. In Linn, R. L. (Ed.), Educational measurement (3rd ed., pp. 263–331). New York: American Council on Education, Macmillan.
Stanovich, K.E. (2009). What intelligence tests miss: The psychology of rational thought. New Haven, CT: Yale University Press.
Tatsuoka, K.K. (1983). Rule space: An approach for dealing with misconceptions based on item response theory. Journal of Educational Measurement, 20, 345–354.
Tatsuoka, K.K. (1985). A probabilistic model for diagnostic misconceptions in the pattern classification approach. Journal of Educational Statistics, 10, 55–73.
Tatsuoka, K.K. (1995). Architecture of knowledge structures and cognitive diagnosis: A statistical pattern recognition and classification approach. In Nichols, P. D., Chipman, S. F., & Brennan, R. L. (Eds.), Cognitively diagnostic assessment (pp. 327–359). Hillsdale, NJ: Lawrence Erlbaum Associates.
Tatsuoka, K.K. (2009). Cognitive assessment. An introduction to the rule space method. New York City, New York: Routledge, Taylor & Francis.
Tatsuoka, K.K., Corter, J.E., & Guerrero, A. (2004). Coding manual for identifying involvement of content, skill, and process subskills for the TIMSS-R 8th grade and 12th grade general mathematics test items (Technical Report). New York: Department of Human Development, Teachers College, Columbia University.
Tatsuoka, K.K., Corter, J.E., & Tatsuoka, C. (2004). Patterns of diagnosed mathematical content and process skills in TIMSS-R across a sample of 20 countries. American Educational Research Journal, 41, 901–926.
Thomas, D., Li, Q., Knott, L., & Zhongxiao, L. (2008). The structure of student dialogue in web-assisted mathematics courses. Journal of Educational Technology Systems, 36, 415–431.
Tubau, E. (2008). Enhancing probabilistic reasoning: The role of causal graphs, statistical format and numerical skills. Learning and Individual Differences, 18, 187–196.
Tversky, B. (2002). Some ways that graphics communicate. In Allen, N. (Ed.), Words and images: New steps in an old dance (pp. 57–74). Westport, CT: Ablex.
U.S. Department of Education, National Center for Education Statistics (2008). Digest of education statistics, 2008, Chapter 6 (NCES 2008–022).
Wilson, M. & Sloane, K. (2000). From principles to practice: An embedded assessment system. Applied Measurement in Education, 13, 181–208.
Zhou, L. (2009). Revenues and Expenditures for Public Elementary and Secondary Education: School Year 2006–07 (Fiscal Year 2007). (NCES 2009–337). National Center for Education Statistics, Institute of Education Sciences, U.S. Department of Education.