Neither hype nor gloom do DNNs justice

Published online by Cambridge University Press: 06 December 2023

Felix A. Wichmann
Affiliation:
Neural Information Processing Group, University of Tübingen, Tübingen, Germany felix.wichmann@tuebingen.de
Simon Kornblith
Affiliation:
Google Research, Brain Team, Toronto, ON, Canada skornblith@google.com
Robert Geirhos
Affiliation:
Google Research, Brain Team, Toronto, ON, Canada geirhos@google.com

Abstract

Neither the hype exemplified in some exaggerated claims about deep neural networks (DNNs), nor the gloom expressed by Bowers et al. do DNNs as models in vision science justice: DNNs rapidly evolve, and today's limitations are often tomorrow's successes. In addition, providing explanations as well as prediction and image-computability are model desiderata; one should not be favoured at the expense of the other.

Type
Open Peer Commentary
Copyright
Copyright © The Author(s), 2023. Published by Cambridge University Press

References

Baker, N., Lu, H., Erlikhman, G., & Kellman, P. J. (2018). Deep convolutional networks do not classify based on global object shape. PLoS Computational Biology, 14(12), e1006613.
Biscione, V., & Bowers, J. S. (2023). Mixed evidence for Gestalt grouping in deep neural networks. Computational Brain & Behavior, 1–19. https://doi.org/10.1007/s42113-023-00169-2
Dehghani, M., Djolonga, J., Mustafa, B., Padlewski, P., Heek, J., Gilmer, J., … Houlsby, N. (2023). Scaling vision transformers to 22 billion parameters. arXiv, arXiv:2302.05442.
DiCarlo, J. J., Zoccolan, D., & Rust, N. C. (2012). How does the brain solve visual object recognition? Neuron, 73(3), 415–434.
Doerig, A., Bornet, A., Choung, O. H., & Herzog, M. H. (2020). Crowding reveals fundamental differences in local vs. global processing in humans and machines. Vision Research, 167, 39–45.
Geirhos, R., Narayanappa, K., Mitzkus, B., Thieringer, T., Bethge, M., Wichmann, F. A., & Brendel, W. (2021). Partial success in closing the gap between human and machine vision. In Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P. S., & Wortman Vaughan, J. (Eds.), Advances in neural information processing systems (Vol. 34, pp. 23885–23899). Curran Associates.
German, J. S., & Jacobs, R. A. (2020). Can machine learning account for human visual object shape similarity judgments? Vision Research, 167, 87–99.
Intemann, K. (2020). Understanding the problem of “hype”: Exaggeration, values, and trust in science. Canadian Journal of Philosophy, 52(3), 1–16.
Lake, B. M., Ullman, T. D., Tenenbaum, J. B., & Gershman, S. J. (2017). Building machines that learn and think like people. Behavioral and Brain Sciences, 40, e253.
Mao, J., Gan, C., Kohli, P., Tenenbaum, J. B., & Wu, J. (2019). The neuro-symbolic concept learner: Interpreting scenes, words, and sentences from natural supervision. In International Conference on Learning Representations (ICLR), New Orleans, Louisiana, United States, pp. 1–28. https://iclr.cc/Conferences/2019
Montero, M. L., Bowers, J. S., Costa, R. P., Ludwig, C. J. H., & Malhotra, G. (2022). Lost in latent space: Disentangled models and the challenge of combinatorial generalisation. arXiv, arXiv:2204.02283.
Pearl, J. (2009). Causality (2nd ed.). Cambridge University Press.
Rao, R. P. N., & Ballard, D. H. (1999). Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience, 2(1), 79–87.
Wichmann, F. A., & Geirhos, R. (2023). Are deep neural networks adequate behavioural models of human visual perception? Annual Review of Vision Science, 9. https://doi.org/10.1146/annurev-vision-120522-031739