7 - Integrated Model of Text and Picture Comprehension

from Part II - Theoretical Foundations

Published online by Cambridge University Press: 19 November 2021

Richard E. Mayer
University of California, Santa Barbara
Logan Fiorella
University of Georgia

This chapter presents an integrated model of text and picture comprehension that accounts for learners' use of multiple sensory modalities in combination with different forms of representation. The model encompasses listening comprehension, reading comprehension, visual picture comprehension, and auditory picture comprehension (i.e., sound comprehension). Its cognitive architecture consists of modality-specific sensory registers, working memory, and long-term memory. Within this architecture, the model distinguishes perception-bound processing of text or picture surface structures from cognitive processing of semantic deep structures.

Publisher: Cambridge University Press
Print publication year: 2021
