
34 - Modeling Vision

from Part IV - Computational Modeling in Various Cognitive Fields

Published online by Cambridge University Press: 21 April 2023

Ron Sun
Affiliation:
Rensselaer Polytechnic Institute, New York

Summary

Vision is one of the most complex proficiencies we possess, but its underpinnings are still shrouded in mystery. Many great scientific minds have been engaged in the enterprise of modeling vision. This chapter takes a look at some of the history of this effort, stretching from the times of the ancient Greeks to recent developments in neural networks, and discusses how current techniques may play a role in furthering our understanding of vision.

Type
Chapter
Information
Publisher: Cambridge University Press
Print publication year: 2023



  • Modeling Vision
  • Edited by Ron Sun, Rensselaer Polytechnic Institute, New York
  • Book: The Cambridge Handbook of Computational Cognitive Sciences
  • Online publication: 21 April 2023
  • Chapter DOI: https://doi.org/10.1017/9781108755610.039