Models of vision need some action

Constantin Rothkopf; Frank Bremmer; Katja Fiehler; Katharina Dobs; Jochen Triesch

doi:10.1017/S0140525X23001577

Published online by Cambridge University Press: 06 December 2023

Constantin Rothkopf: Affiliation:
Centre for Cognitive Science, Technical University of Darmstadt, Darmstadt, Germany; Frankfurt Institute for Advanced Studies, Goethe-Universität Frankfurt, Frankfurt am Main, Germany; Center for Mind, Brain and Behavior, University of Marburg and Justus Liebig University Giessen, Giessen, Germany; HMWK-Clusterproject The Adaptive Mind, Hesse, Germany, https://www.theadaptivemind.de/; constantin.rothkopf@cogsci.tu-darmstadt.de
Frank Bremmer: Affiliation:
Center for Mind, Brain and Behavior, University of Marburg and Justus Liebig University Giessen, Giessen, Germany; HMWK-Clusterproject The Adaptive Mind, Hesse, Germany, https://www.theadaptivemind.de/; Applied Physics and Neurophysics, University of Marburg, Marburg, Germany; frank.bremmer@physik.uni-marburg.de
Katja Fiehler: Affiliation:
Center for Mind, Brain and Behavior, University of Marburg and Justus Liebig University Giessen, Giessen, Germany; HMWK-Clusterproject The Adaptive Mind, Hesse, Germany, https://www.theadaptivemind.de/; Experimental Psychology, Justus Liebig University Giessen, Giessen, Germany; Katja.Fiehler@psychol.uni-giessen.de
Katharina Dobs: Affiliation:
Center for Mind, Brain and Behavior, University of Marburg and Justus Liebig University Giessen, Giessen, Germany; HMWK-Clusterproject The Adaptive Mind, Hesse, Germany, https://www.theadaptivemind.de/; Experimental Psychology, Justus Liebig University Giessen, Giessen, Germany; katharina.dobs@psychol.uni-giessen.de
Jochen Triesch: Affiliation:
Frankfurt Institute for Advanced Studies, Goethe-Universität Frankfurt, Frankfurt am Main, Germany; Center for Mind, Brain and Behavior, University of Marburg and Justus Liebig University Giessen, Giessen, Germany; HMWK-Clusterproject The Adaptive Mind, Hesse, Germany, https://www.theadaptivemind.de/; triesch@fias.uni-frankfurt.de

Abstract

Bowers et al. focus their criticisms on research that compares behavioral and brain data from the ventral stream with a class of deep neural networks for object recognition. While they are right to identify issues with current benchmarking research programs, they overlook a much more fundamental limitation of this literature: it disregards the importance of action and interaction for perception.

Type: Open Peer Commentary
Information: Behavioral and Brain Sciences, Volume 46, 2023, e405

DOI: https://doi.org/10.1017/S0140525X23001577
Copyright: Copyright © The Author(s), 2023. Published by Cambridge University Press
