Emotional modulation of peripersonal space impacts the way robots interact

Peripersonal space refers to the area around the body that is perceived as secure and reachable. The ability to build such a representation is necessary in both approach and avoidance behaviors. Several studies show that the perception of reachable and comfort areas depends on emotions. In this paper, we describe how we model an appetitive and an aversive pathway based on the role of some brain regions. The obtained emotional states modulate the robot perception of its peripersonal space. This representation is directly used to control the robot behavior. Based on a single-resource multirobot experiment, we show the impact of such an emotional modulation. Aggressive or fearful behaviors emerge from the dynamics of interaction between the simulated robots.


Introduction
Peripersonal space (PPS) refers to a multimodal sensorimotor interface between the body and its environment (Rizzolatti et al., 1997). It is the area around individuals in which an external intrusion can be perceived as possibly threatening, or at least uncomfortable (Kennedy et al., 2009; Tajadura-Jiménez et al., 2011). It also determines the reachable space in action-related contexts (Valdés-Conroy et al., 2012). Thus, PPS perception is by definition relevant to both approach and avoidance behaviors. The parietal cortex is thought to play a key role in such a multimodal representation of the surroundings (Rizzolatti et al., 1997; Graziano and Cooke, 2006).
It has been demonstrated that PPS representation is plastic. Indeed, positively valenced objects tend to be perceived as more reachable than negative ones (Valdés-Conroy et al., 2012). However, the presence of threatening objects in our peripersonal area can be perceived differently. For instance, a knife seems farther away when oriented toward us, i.e. when potentially dangerous (Coello et al., 2012). On the other hand, a positive affective state, induced by pleasant music for instance, can impact the PPS as well, reducing the area needed to feel comfortable in overcrowded spaces (Tajadura-Jiménez et al., 2011).
In this work, we model an appetitive and an aversive pathway based on the idea that, in biological organisms, basic motivated behavior can be represented in terms of approach and avoidance. Following a constructivist approach, we mimic the role of brain regions involved in the emotion circuitry. Our aim is to find the minimal set of signals required to model the robot emotional state.
Various approaches for modeling emotions in robots and artificial agents can be found in the literature. They generally rely on a set of drives to guide the agents' behavior (Cañamero, 1997; Hirth et al., 2007). Recent work has also proposed models based on hormonal modulation (Lones et al., 2014) and neuromodulatory signals (Krichmar, 2013). Moreover, in some cases, emotions are considered as a way to implement metacontrollers and self-regulation loops (Sanz et al., 2013; Jauffret et al., 2013).
Here, we want to model PPS as a representation of the surrounding area that is both secure and reachable. This representation should be modulated by the robot emotional states in order to integrate a subjective and motivated perception of its environment. More specifically, we address the cases where approach and avoidance motivations are in contradiction. In these cases, we can observe the impact of the emotional state on the robot behavior. We show the effect of PPS emotional modulation on the interaction between two robots in a single-resource situation. We also observe that their behavior expresses aspects of their internal states. Depending on whether lateral inhibition between appetitive and aversive pathways is allowed, the robots seem aggressive/determined or fearful/patient.
In the next sections, we present our model for the emotional modulation of peripersonal space and discuss results from multirobot competition simulations.

Modeling appetitive and aversive pathways
Pleasure and pain are considered basic components of emotions (Damasio, 2003). The dopaminergic pathways are generally associated with pleasure and reward prediction in the literature (Berridge, 2012). The Ventral Tegmental Area (VTA) contains the largest group of dopamine neurons and projects to limbic structures like the amygdala (mesolimbic pathway) and to the prefrontal cortex (mesocortical pathway). On the other hand, nociceptor signals are transmitted from the spinal cord. In the neural processing of pain and punishment aversion, serotonin, which is mainly produced in the Raphe Nuclei, also plays a significant role (Cools et al., 2008). In this paper, we do not aim at a detailed model of the pleasure and pain neural circuitries. Rather, we are interested in integrating reward and punishment signals and in modeling interactions between dopaminergic and serotonergic pathways to inhibit either appetitive or aversive behaviors.
Motivations are also a key component of emotions (Damasio, 2003). Some theorists see emotions as an expression of motivational states that prepares for action and triggers cognitive control (Pessoa, 2008; Inzlicht et al., 2015). Here, we are interested in modeling low-level appetitive and aversive drives. For example, the Hypothalamus (HTH) links the nervous system to the endocrine system. It thereby intervenes in various bodily functions such as monitoring physiological parameters and regulating hunger and thirst. Moreover, one of the functions of the superior colliculus (SC) is the integration of multiple sensory inputs in order to trigger defensive behavior like avoidance or withdrawal (Comoli et al., 2012).
Inputs from the hypothalamus and sensory information are relayed from the thalamus to the amygdala. The latter plays a key role in emotions. It responds with higher activations in the presence of arousing stimuli and projects to the Reticular Formation (RF), which is thought to modulate the arousal level of the central nervous system (Cardinal et al., 2002). In addition, Kennedy et al. (2009) suggest that the amygdala is necessary for PPS representation. Indeed, it is required for the attribution of positive or negative values to stimuli through stimulus-stimulus (S-S) Pavlovian conditioning. It also allows for stimulus-response (S-R) Pavlovian conditioning (Cardinal et al., 2002). In this work, we only describe reflex pathways. However, our model is consistent with the idea that the amygdala participates in building the robot emotional and motivational states through conditioning. For instance, unconditional stimuli can be associated with reward or punishment signals. Such predictions would trigger or emphasize approach or avoidance behavior, as in the incentive motivation literature (Berridge, 2012). Figure 1 summarizes the way we model appetitive and aversive pathways in order to modulate the robot PPS perception. Please note that we do not aim to present a precise model of the brain structures involved. We rather mimic some of their functions in order to propose a bio-inspired model that is consistent with the literature.
In this work, we model two basic low-level motivations: the feeding drive (appetitive) and the safety drive (aversive). In the latter, we calculate the mean activity of the n_s proximity sensors s_i to obtain the level th of threat at time t:

\[ th(t) = \frac{1}{n_s} \sum_{i=1}^{n_s} s_i(t) \qquad (1) \]

Figure 2: Comparison between a direct perception of the level of a physiological variable (PhV) and the HTH model (30 feeding cycles each). We suppose the PhV increases and decreases linearly in our system. The results show how using the HTH model allows the robot to anticipate the lack of food and prevents depletion.
The feeding drive is guided by the perceived level of a simulated physiological variable, based on the model of the hypothalamus proposed in (Hasson, 2011). The level r of the physiological variable associated with the resource at time t is: where r_max is the maximal variable level (set to 1), α_r and β_r respectively indicate the ingestion and the consumption speed factors and I is the ingestion signal. Using this HTH model gives the robot the ability to anticipate the lack of food in order to trigger the appropriate behavior. For the sake of simplicity, let us consider that the level of a physiological variable (PhV) increases and decreases linearly. Assuming both functions reach their maxima simultaneously, the PhV level perceived with the HTH model drops below the satisfaction level more quickly in the consumption phases. Figure 2 shows the result of such a comparison between a direct perception of the PhV level and the HTH model.
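As an illustration of the linear simplification mentioned above, the following minimal sketch updates a physiological variable step by step. The update rule is an assumption made for illustration (a linear gain α_r while ingesting, a linear loss β_r otherwise, clipped to [0, r_max]); it is not the HTH model of (Hasson, 2011). The function names and the parameter values are hypothetical.

```python
def update_phv(r, ingesting, alpha_r=0.05, beta_r=0.01, r_max=1.0):
    """One time step of a linearly varying physiological variable (illustrative)."""
    # Assumed rule: gain alpha_r while ingesting, loss beta_r otherwise.
    delta = alpha_r if ingesting else -beta_r
    # Keep the level inside [0, r_max].
    return min(r_max, max(0.0, r + delta))


def steps_until_hungry(r0, threshold, beta_r=0.01):
    """Count consumption steps before the level drops below the threshold."""
    r, steps = r0, 0
    while r >= threshold:
        r = update_phv(r, ingesting=False, beta_r=beta_r)
        steps += 1
    return steps
```

Choosing alpha_r larger than beta_r reflects the setting described later in the paper, where consumption is slower than ingestion.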
We define the approach m_ap and avoidance m_av motivation levels at time t as follows: where ε_m and γ_m respectively represent the integration and inhibition factors of the competition.
Similarly, we obtain a medium-term affective state a that integrates punishment a_pn and reward a_rw signals at time t using the following equations: where ε_a and γ_a respectively represent the integration and inhibition factors of the competition.
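The kind of competition described here, with an integration factor and an inhibition factor, can be sketched as two leaky integrators that inhibit each other. The update rule below is an assumption chosen for illustration and should not be read as the equations of the model; the function name, the clipping to [0, 1] and the default values of eps and gamma are hypothetical.

```python
def compete(a, b, in_a, in_b, eps=0.2, gamma=0.5):
    """One step of two leaky integrators with mutual (lateral) inhibition."""

    def clip(v):
        # Keep levels inside [0, 1].
        return min(1.0, max(0.0, v))

    # Each level integrates its own input (rate eps) and is inhibited
    # by the current level of the opposite pathway (weight gamma).
    new_a = clip(a + eps * (in_a - gamma * b - a))
    new_b = clip(b + eps * (in_b - gamma * a - b))
    return new_a, new_b
```

Calling compete repeatedly with constant inputs lets the stronger input suppress the weaker one. Setting gamma to 0 removes the lateral inhibition, so each level simply converges to its own input, which corresponds in spirit to an architecture without competition.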

Proposed model for emotional modulation of peripersonal space
Since PPS is a sensorimotor interface with the world that is related to both approach and avoidance behaviors, we suggest it is worth modeling in a robotic system. Here, we are more precisely interested in its modulation by emotional states.
If we consider a mobile robot in a navigation task, we can represent various states of its PPS perception as in Fig. 3. Indeed, its comfort zone can contract or dilate according to the pleasantness of the current affective state. Also, appetitive and aversive stimuli respectively induce an extension or a retraction of the reachable space in the corresponding direction.
We propose that peripersonal space perception is based on a working memory that integrates sensorimotor information (see Fig. 4). For instance, the robot can remember the position of an obstacle it avoided. It can also update a path integration vector associated with a goal according to the speed and direction of instantaneous movement. We propose that this sensorimotor input has to be integrated according to the current affective state. Thus, if the robot perceives a collision as a punishment signal, obstacles become more salient and leave a bigger trace in the working memory. Indeed, a punishment-induced negative state expands the robot comfort zone, i.e. the space in which intrusions seem threatening.

Figure 4: Model for building a representation of the robot peripersonal space. It is based on a working memory taking input from various sensory modalities. PPS is modulated by the robot emotional states in order to integrate a subjective and motivated perception of its environment.
In this paper, we do not focus on the building of the working memory. It is based on the principle described in (Hasson and Gaussier, 2010). The robot can learn to associate several goal locations with the drives they satisfy (e.g. hunger and thirst). Proprioceptive path integration fields allow it to return to the resource locations when needed. Please note that the working memory has a limited capacity. Yet the model is able to handle multiple goals by replacing the least used memory field when new resources are discovered.
Information from the working memory can be merged to offer a representation of the robot peripersonal space. However, this perception highly depends on the motivational state. For example, an appetitive drive makes a desirable object seem more reachable. Likewise, a defensive motivation highlights aversive stimuli in the comfort zone and generates an avoidance behavior. Therefore, we suggest that a second emotional modulation occurs in order to filter information coming from the working memory. This motivated perception is directly used to determine robot actions.
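The replacement policy mentioned above can be illustrated with a small data structure. This is only a sketch of a limited-capacity store that evicts the least used entry; the class name, the method names and the use of a simple counter are assumptions made for illustration, and the actual working memory follows (Hasson and Gaussier, 2010).

```python
class GoalMemory:
    """Fixed-capacity store of goal fields that replaces the least used one."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.fields = {}  # goal name -> [associated drive, use count]

    def use(self, goal):
        """Record that the robot returned to a stored goal."""
        self.fields[goal][1] += 1

    def discover(self, goal, drive):
        """Store a new goal; return the evicted goal name, or None."""
        if goal in self.fields:
            return None
        evicted = None
        if len(self.fields) >= self.capacity:
            # Evict the field with the lowest use count.
            evicted = min(self.fields, key=lambda g: self.fields[g][1])
            del self.fields[evicted]
        self.fields[goal] = [drive, 0]
        return evicted
```

With a capacity of two, discovering a third resource evicts whichever stored goal the robot has returned to least often.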

Single-resource multirobot competition

Implementation details
This experiment is performed on the Webots simulator in order to avoid damaging real robots. We simulate two identical robots moving in a 17.5 m x 15 m environment. The 4-wheel mobile robot platforms are 40 cm wide, 50 cm long and 1 m high. They are equipped with light sensors, which are placed under the robots to detect a 45 cm x 45 cm colored zone in the center of the environment. This zone is considered as a resource. The platform also has 9 ultrasound proximity sensors, of which we only use a subset to cover a 180-degree-wide front field.
The robots' affective states depend on the received punishment and reward signals, which are used to simulate pain and pleasure (see Equation 4). In this experiment, they are respectively given by collision and resource detection. In addition, each robot has one appetitive and one aversive drive, respectively feeding and protecting its own physical integrity (see Equations 2 and 1).
Starting from the resource location, the return vector is calculated by integrating the speed and direction of instantaneous movements. The activity p of each neuron i in the path integration field at time t is given by the following equation: where t_r is the last reset time, s the linear speed, d the direction, R the reset signal and n the size of the neural field.
The feeding drive becomes active whenever the level of the physiological variable drops below a satisfaction threshold s_th. The robot then uses the path integration vector to return to the resource. Similarly, obstacle detection triggers the defensive drive and generates an avoidance behavior. In some cases, these two low-level motivations can be contradictory, e.g. if there is an obstacle (object or other robot) on the way. A competition between the appetitive and aversive drives allows them to inhibit each other, which favours the approach or the avoidance behavior depending on the drive levels (See Equations 4).
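A minimal sketch of path integration over a field of n neurons is given below. It assumes a cosine population code, in which each neuron has a preferred direction and accumulates the projection of every movement onto it since the last reset; this coding is a common choice and is an assumption here, not necessarily the equation used in the model. The function names are hypothetical, and starting from a zero field plays the role of the reset at the resource.

```python
import math


def integrate_path(moves, n=36):
    """Accumulate (speed, direction) steps into a cosine-coded field of n neurons."""
    field = [0.0] * n  # state right after a reset: all activities at zero
    for speed, direction in moves:
        for i in range(n):
            preferred = 2.0 * math.pi * i / n  # preferred direction of neuron i
            field[i] += speed * math.cos(direction - preferred)
    return field


def decode(field):
    """Recover the distance and heading of the accumulated displacement."""
    n = len(field)
    x = sum(p * math.cos(2.0 * math.pi * i / n) for i, p in enumerate(field)) * 2.0 / n
    y = sum(p * math.sin(2.0 * math.pi * i / n) for i, p in enumerate(field)) * 2.0 / n
    return math.hypot(x, y), math.atan2(y, x)
```

The return vector toward the resource has the decoded distance and the opposite heading (the decoded heading plus π).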
We use a dynamic neural field (DNF) (Schöner et al., 1995) to merge reachable space and comfort zone information. Appetitive and aversive stimuli given as input generate attractors and repulsors in the DNF. The potential u of each neuron x is updated as follows: where f(x) = tanh(x) is used to calculate the neuron activity, τ is the time constant, I the input, h a constant inhibition potential and w an interaction kernel. Using a DoG (Difference of Gaussians) function as the interaction kernel allows proximal stimuli to reinforce each other and to inhibit distant ones. The highest neuron activity is used to calculate the linear speed. Also, a readout of the output derived signal (according to the current orientation) allows for computing the rotational speed.
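To make the role of each term concrete, here is a minimal sketch of one Euler step of a neural field. It assumes the standard Amari-type form, τ du/dt = -u + I + h + Σ w(x - x') f(u(x')), discretized over a ring of directions; whether the model uses exactly this form is an assumption, and the kernel amplitudes, widths, h, τ and the helper names are hypothetical values chosen for illustration.

```python
import math


def ring_distance(i, j, n):
    """Shortest distance between two neurons on a ring of n directions."""
    d = abs(i - j)
    return min(d, n - d)


def dog(d, a_exc=0.08, s_exc=2.0, a_inh=0.03, s_inh=6.0):
    """Difference-of-Gaussians kernel: local excitation, broader inhibition."""
    return (a_exc * math.exp(-d * d / (2.0 * s_exc ** 2))
            - a_inh * math.exp(-d * d / (2.0 * s_inh ** 2)))


def dnf_step(u, inp, tau=10.0, h=-0.1, dt=1.0):
    """One Euler step of tau * du/dt = -u + I + h + sum_x' w(x - x') f(u(x'))."""
    n = len(u)
    act = [math.tanh(v) for v in u]  # neuron activity f(u) = tanh(u)
    new_u = []
    for x in range(n):
        lateral = sum(dog(ring_distance(x, y, n)) * act[y] for y in range(n))
        new_u.append(u[x] + dt / tau * (-u[x] + inp[x] + h + lateral))
    return new_u


def bump(center, n, amp, sigma=2.0):
    """Gaussian input around one direction (amp > 0 attracts, amp < 0 repels)."""
    return [amp * math.exp(-ring_distance(i, center, n) ** 2 / (2.0 * sigma ** 2))
            for i in range(n)]
```

Summing a positive bump (appetitive stimulus) and a negative bump (aversive stimulus) as the input and iterating dnf_step raises the potential around the attractor and lowers it around the repulsor.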

Method
In order to study the impact of the emotional modulation of the peripersonal space, we compare the behavior of our model with two altered versions of the architecture:
• Model version: We use the proposed model described in the previous section.
• NoCompet version: In this version, we still modulate the robot PPS according to its emotional state. However, there is no lateral inhibition between punishment and reward signals, nor between appetitive and aversive drives.
• NoModul version: In this version, no modulation of approach/avoidance is performed at all. Robot drives only serve to trigger homing behavior, for example. This version is the closest to a classical reactive architecture, except that here approach and avoidance have the same weight in the DNF.
We define a cycle as an interval in which a robot, initially satisfied (non-hungry), consumes the energy obtained from the previous ingestion and returns to the resource in order to feed once again. Each of these cycles is considered as an independent sample of the multirobot competition for the resource. Once the feeding drive is satisfied, robots move away from the resource. They randomly navigate in the environment, updating their path integration field to be able to return to the resource when needed. We use three measures:
• min phyvar: Lowest level of the physiological variable associated with the feeding drive,
• nb own access: Number of own accesses to the resource within a full cycle,
• nb other access: Number of other robot accesses to the resource within own cycle.
The first one is a measure of food depletion, i.e. how close to starvation the robots get. The latter two quantify cycle interruptions. In addition, we consider two variables:
• model: Whether our model is used or not (the NoCompet and NoModul versions are gathered in the same group),
• version: Which version is used (each of the 3 versions is associated with a group).
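The three measures can be computed from a per-step log of one feeding cycle, as in the following sketch. The log format is an assumption made for illustration: a list of physiological variable levels and two lists of booleans telling whether each robot is on the resource at each step. An access is counted each time a robot enters the resource; the function and key names are hypothetical.

```python
def cycle_measures(phyvar, on_resource_own, on_resource_other):
    """Compute the three measures from per-step logs of one feeding cycle."""

    def count_accesses(flags):
        # An access is a transition from "off the resource" to "on the resource".
        previous = [False] + list(flags[:-1])
        return sum(1 for prev, cur in zip(previous, flags) if cur and not prev)

    return {
        "min_phyvar": min(phyvar),
        "nb_own_access": count_accesses(on_resource_own),
        "nb_other_access": count_accesses(on_resource_other),
    }
```

A cycle in which the robot is pushed off the resource and comes back counts two own accesses, which is the kind of interruption these measures are meant to capture.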
A Kruskal-Wallis non-parametric test shows that there is no effect of version or model on min phyvar (resp. χ² = 0.45, p = 0.80; and χ² = 0.00, p = 0.95). No significant effect of version on nb own access was found either (χ² = 5.36, p = 0.07). However, there is a strong tendency with model (χ² = 3.81, p = 0.05). Also, there is a main effect of both model and version on nb other access (resp. χ² = 6.03, p = 0.01; and χ² = 8.19, p = 0.02). In addition, a Mann-Whitney test shows a significant effect of version on nb other access between Model and NoCompet (U = 82, p < 0.05).
Moreover, let us consider two additional measures, bin(nb own access) and bin(nb other access), respectively equal to 1 if nb own access and nb other access are greater than 1, and 0 otherwise. Indeed, in the case of a perfect alternation of the robots over the resource, each should access it exclusively and only once in every cycle. Any other configuration could correspond to a feeding cycle being interrupted by another robot. In this case, we find a strong tendency on bin(nb own access) with both model and version (resp. χ² = 3.68, p = 0.05; and χ² = 6.01, p = 0.05) as well as a significant effect on bin(nb other access) (resp. χ² = 5.19, p = 0.02; and χ² = 6.73, p = 0.03).

Behavioral comparison
When the NoCompet or NoModul architectures are used, the robots tend to be unable to access the resource before it is free, i.e. before the other robot is done feeding. However, with the Model version, they can push one another and compete for the resource. This is due to the approach sub-behavior inhibiting the defensive one.
Figure 6 shows arousing areas as well as positively and negatively valenced ones in the environment. We notice that in the NoCompet case, the area around the resource is one of the most arousing because both drives are simultaneously active. The robots generally need to feed but avoid collisions with the other one currently on the resource. Also, rewards are only obtained right at the center of the resource and punishments around it. With the Model and NoModul architectures, the emotional states are generated in exactly the same way, even though in the latter they do not modulate the robots' PPS and thereby their behavior. We see that the most arousing areas are less localized than in the NoCompet case. Indeed, the competition between approach and aversion makes them inhibit each other and avoids arousal saturation. With the Model version, however, negative valence reaches a lower level than with the two other architectures. This is due to the robots' tendency to avoid collisions less than with NoCompet and NoModul.

General discussion
In this experiment, we showed how the emotional modulation of the robots' PPS impacted their behavior and the way they interacted. We compared our Model with two altered versions of the architecture. Statistical results regarding the min phyvar measure revealed no significant difference between the architectures in terms of food depletion. This is due to the random exploration following feeding phases and to resource consumption being slower than its ingestion. This leaves the possibility for the robots to alternate in resource access. Yet, it is interesting to observe how this alternation occurs, that is to say, how the robots interact in this survival task.
The measures nb own access and nb other access allow for characterizing this alternation in resource access. Indeed, if robots wait for each other, each one has access to the resource exclusively and only once in every cycle. If greater than 1, these measures indicate that a feeding phase was interrupted. The results show a strong tendency with the variable model on nb own access and with both model and version on bin(nb own access). They also show a significant effect of both variables on nb other access and bin(nb other access).
Cycle interruptions are directly related to robots being pushed by each other far from the resource. When the NoCompet or NoModul architectures are used, robots tend to be unable to access the resource before it is free. In the NoModul case, accidental collisions may occur but robots generally deviate from the resource in order to avoid the other one which is currently feeding. On the other hand, interactions between the robots seem to carry a social significance with the NoCompet and Model versions. In both cases, they behave in a way that expresses aspects of their internal states. With the former, robots seem either patient or fearful. The modulation of their PPS makes them extend their comfort zone. They are more sensitive to aversive stimuli and the defensive sub-behavior tends to take over the appetitive one. On the contrary, using the Model version, the robots seem more proactive and determined. When the resource is not available, they try to push whatever is in their way. Thereby, they display an aggressive behavior.
Similar behaviors are observed in (Lones et al., 2014). A hormone-based model is tested in competitive and non-competitive environments. In the former case, aggressive or withdrawn populations of agents can emerge from an epigenetic adaptation mechanism. In the literature, a dichotomy of aggression opposes the proactive forms to the reactive ones (Weinshenker and Siegel, 2002; Vitiello and Stoff, 1997). In the first class, a predatory attack serves as a way to get a reward. It is instrumental and is generally accompanied by a very low level of sympathetic arousal. This kind of behavior is not modeled here, although it can be observed due to the dynamics of interaction between the robots. In contrast, affective defense describes any aggressive response triggered by elements of fear and/or threat. It is reactive, rather than proactive or instrumental. Also, it is more related to anger, which is represented as a negatively valenced affect with relatively high arousal (or intensity) in dimensional models of emotion (Posner et al., 2005). The emotional state of our robot during aggressive episodes is consistent with this representation.
Unlike in Lones et al. (2014), in our work, the emergence of aggressive vs. fearful behavior depends on whether a lateral inhibition between appetitive and aversive motivations is allowed in the architecture. Although we take inspiration from interactions between dopaminergic and serotonergic pathways, the competition between approach and avoidance implemented here is relatively basic. Indeed, in some contexts, such a winner-takes-all mechanism would be inefficient or irrelevant. Krichmar (2013) highlights the role of cholinergic and noradrenergic systems in regulating the dopamine and serotonin levels. Modeling such top-down neuromodulatory mechanisms could be an interesting option.
Besides, Kennedy et al. (2009) show that individuals with bilateral amygdala lesions fail to represent peripersonal space boundaries correctly. The authors suggest this is due to the absence of strong emotional responses to personal space violation. Indeed, the capacity to associate stimuli with reward or punishment signals seems necessary. Berridge (2012) also highlights the role of Pavlovian conditioning in reward prediction and incentive motivation. In our case for instance, adding instrumental learning mechanisms would allow our robots to switch from purely reactive affective responses to goal-oriented aggression.
One particularity of our model is the idea that PPS representation is based on a subjective perception. The size of the robot comfort zone is modulated by its affective state (Tajadura-Jiménez et al., 2011). This accentuates the fearful behavior, for example. One could argue that it is very specific to the defensive mechanisms. Yet, the robot's perception of its elementary displacements could also be modulated in the same way. In most cases where feeding only depends on the ability to return to a resource, as here, this would lead path integration to failure. However, future work will investigate situations where such an erroneous perception can be useful.

Conclusion
In conclusion, we propose a model allowing the robot to build a representation of its peripersonal space based on a subjective and motivated perception. Based on the role of certain brain structures, we identify a set of signals required to model the robot emotional state. Thus, in our model, signals of pleasure vs. pain and approach vs. avoidance interact and modulate the robot perception. By simulating a simplified version of biological behaviors, we are able to observe the impact of some alterations on the model. We show how the emotional modulation of the peripersonal space makes the robots interact in a way that expresses aspects of their internal states. In addition, depending on whether lateral inhibition between appetitive and aversive pathways is allowed, the robots seem more aggressive or more fearful.

Figure 1: Brain regions involved in our model. The appetitive and aversive pathways are respectively represented in green and red. The yellow arrow illustrates the emotional modulation of the peripersonal space perception.

Figure 3: Different forms of modulation of robot PPS. FAR-LEFT: No modulation. LEFT and CENTER-LEFT: The comfort zone contracts or dilates according to the pleasantness of the affective state. CENTER-RIGHT, RIGHT, and FAR-RIGHT: Appetitive and aversive stimuli respectively induce an extension or a retraction of the reachable space in the corresponding direction.

Figure 5: Statistical significance of the effect of different architecture versions on the considered measures. No effect has been revealed on food depletion (min phyvar). However, a strong tendency is found with the model variable on nb own access and with version on bin(nb own access). There is also a main effect of version on nb other access and bin(nb other access).

Figure 6: Heatmaps representing arousing (left column) and positively and negatively valenced (right column) areas in the environment. They are averaged for both robots. The brackets on the colorbars show the intervals between min and max values. The rows respectively correspond to Model, NoCompet and NoModul from top to bottom.