Clark and Fischer (C&F) argue that social robots are depictions of human social agents. Importantly, their argument draws heavily upon western art in the mimetic tradition, where the primary purpose (and value) of art lies in how accurately an artwork imitates reality (Shimamura, 2011). Social robots are conceptualised as interactive depictions of real humans and likened to actors in a play. C&F link the quality of a social robot to its resemblance to a human agent: The better the social robot impersonates a human agent, the more likely it is that people will interact with the robot as they would with a human.
Here, we argue that the analogy between social robots and mimetic art is flawed. This is because in many cases – including the examples provided by the authors – a social robot does not pretend to be a human agent, but instead participates in genuine social interactions, as a robot. Social robots are better likened to performance artists or dancers than to actors; rather than depicting social interactions, they perform social interactions. This distinction between performance and depiction is important for better understanding and situating the scope and the limits of robots as social agents (Cross & Ramsey, 2021).
Much of western contemporary art neither depicts nor represents. This is especially true for performance art. For example, in Marina Abramovic's famous performance installation “The artist is present” (Abramovic & Biesenbach, 2010), she invites visitors to sit down opposite her at a table in a gallery. Abramovic neither depicts a social interaction in this artwork – she genuinely meets other people – nor does she impersonate a character. The encounter is thus performed, but it is not depicted; depicted and depictive scenes are the same. Similarly, many contemporary choreographers and theatre makers create works without a linear narrative, storyline, or obvious characters (see Fig. 1 for an example). In fact, dissolving the binary distinction between depicted and depictive scenes, or acting and not-acting (Kirby, 1987), is an important aesthetic feature of contemporary theatre, dance, and performance art (Fischer-Lichte, 2017; Lehmann, 2005). The aesthetics of dance and performance do not necessarily depend on how realistically a character is impersonated, but on a performer's expressiveness (Christensen, Lambrechts, & Tsakiris, 2019), changes in the speed and acceleration of movement sequences (Orlandi, Cross, & Orgs, 2020), or movement synchrony among a group of performers (Cracco et al., 2022; Vicary, Sperling, von Zimmermann, Richardson, & Orgs, 2017). Much of contemporary performance art or non-narrative dance therefore lacks a clear separation between depicted and depictive scenes.
Figure 1. Performing without depicting. Seke Chimutengwende and Steph McMann in Detective Work (Chimutengwende & McMann, 2021). Choreography by Seke Chimutengwende in collaboration with Steph McMann, commissioned by NEUROLIVE. Image by Hugo Glendinning.
C&F describe a similar example of performing without depicting: The robot “Smooth” offers a drink to Beth, who grabs the drink and thanks the robot. Beth responds to the robot naturally and intuitively, because – as in performance art – there is no distinction between depicted and depictive scenes. The robot performs a genuine social interaction: One physically embodied, social agent offers an object to another physically embodied, social agent. The robot therefore does not pose as a social agent; it is a genuine social agent.
In both performance art and in social interactions with robots, base scene and depictive scene are still present, yet this distinction is not specific to (or required for) engaging with performance art or social robots. People consist of bones, blood, organs, water, and so on, just as robots consist of metal and wiring. We can choose to interact with real people at different levels. For example, a surgeon spends most of her time working with the physical reality of the body, and not the person. Moreover, in many real-life social interactions people pretend, simulate, or act (Goffman, 1990). The distinction between three levels of depiction is thus not specific to robotic agents but equally applies to human agents.
Conceptualising social robots as depictions, therefore, does not help to explain in what way robots are similar to or different from human social agents. Instead, we argue that social robots are better characterised by the properties of their social interactions, for example human-like movement kinematics or turn-taking behaviour. Importantly, the physical properties of an agent – for example, the extent to which it resembles the human body – are arguably less important than the way it moves or interacts with the world around it (Cross et al., 2012; Ramsey & de Hamilton, 2010). Abstract shapes can produce vivid illusions of agency, expressivity, and social relationships, as first shown in the now-famous animations of Heider and Simmel (1944), a finding that has been replicated, extended, and discussed extensively over the past half century (cf. Press, 2011).
In our own research, we have shown that movements that comply with the kinematics of human action are judged to be more natural and aesthetically pleasing than movements that violate human kinematics (Chamberlain et al., 2022). In the case of dance, greater predictability of movement kinematics increases aesthetic preference. A given sequence of dance movements is more appealing if the movements are performed with salient and rhythmic changes in speed and acceleration (Orlandi et al., 2020). Importantly, greater movement predictability also allows for smoother social interactions. For example, in cooperative tasks between two people, individuals reduce the variability of their movements to facilitate turn-taking (Vesper, van der Wel, Knoblich, & Sebanz, 2011).
In other words, we remain unconvinced that the separation between different levels of depiction is necessary or sufficient to explain why people engage socially with robots in some situations but not others. Levels of depiction do not explain why people engage with dance or performance art, because these levels do not necessarily exist for these art forms. Arguably, the interesting question is not what difference exists between real and depicted social agents, but instead: What constitutes an effective social interaction, no matter at what level of depiction it is performed?
Financial support
GO and ESC received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreements No. 864420 – Neurolive and No. 677270 – Social Robots). ESC also gratefully acknowledges funding from the Leverhulme Trust (PLP-2018-152).
Competing interest
None.