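Evolved neural network controllers for physically simulated robots that hunt with an artificial visual cortex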

Using a rule-based system for growing artificial neural networks, we have evolved controllers for physically simulated robotic "spiders". The controllers take their input from an “artificial retina” that senses other spiders and inanimate barrier objects in the environment, and must provide output to dynamically control the 18 degrees of freedom of the six legs of the robot every time step. We perform evolutionary runs with two species of spider that interact in simulation with each other and with inanimate barrier objects. One species (the "predator") is selectively rewarded for "eating" (by physically colliding with) the other species, and the other (the "prey") is selectively penalized for being caught, and rewarded for "eating" the barriers. The two species evolve complex running gaits, with control inputs coming from their retinas that produce hunting or avoidance behavior. We suggest that predator-prey frequency dependent selection can provide a relatively long-term genetic memory of previously searched regions of phenotype space, enforcing a form of novelty search that may reduce duplicated evolutionary search effort.


Introduction
One of the primary goals in the field of artificial developmental systems is to evolve systems that show an open-ended increase in complexity over evolutionary time. "Generative" artificial developmental systems (Boers and Kuiper, 1992; Jacob and Rehder, 1993; Gruau, 1994; Hornby and Pollack, 2001) are intended to provide the possibility of such open-ended increase, whereas, for example, a standard genetic algorithm with a fixed genome length, and fixed phenotypic meanings of all genetic loci, does not. We have previously (Palmer, 2011) made the observation that complexity is not necessarily selectively favored over simplicity: even though multicellular organisms exist, bacteria, in their relative simplicity, still make a fine living. Nonetheless, in natural history, it has apparently been the case that more complex organisms can sometimes do things that simpler ones cannot, and thereby outcompete them. If we can set up such situations, then we may be able to drive the evolution of complexity in silico. Gould (1994) has argued that the complexity of life could be due to drift; nonetheless, if we can create indirect selective pressure for it, complexity will increase much more rapidly than it would by drift alone.
Unfortunately, starting with primitive artificial organisms and immediately selecting for complex functions (such as complex cognition) typically results in uniformly low fitness for all of the organisms. Long-term evolutionary progress requires a fitness landscape that offers selective "hints", on some timescale, pointing in the direction of greater fitness. These hints need not always be present, or even be fully consistent, but on evolutionary landscapes that sometimes reveal biases in their broader structure, an evolutionary algorithm can make long-term progress, rather than becoming trapped in local optima for long periods of time. This suggests that we can increase the probability of long-term evolutionary progress by creating a series of selective "stepping-stones" of increasing difficulty (for example, selecting for success at increasingly "difficult" cognitive tasks). This is also known as "incremental evolution" (Winkeler and Manjunath, 1998) or "scaffolding" (Bongard, 2011). We might manually design a series of such stepping-stones on simple problems, but another long-term goal of artificial evolution is to solve problems that we do not know how to solve manually (or that are too expensive to solve manually). Our previous work on the "L-Brain" model (Palmer, 2011) defined a new generative method for "growing" neural networks via an artificial developmental process. Therefore we sought a means to automatically generate a series of increasingly difficult selective challenges, in an attempt to drive selection for complexity in such "grown" networks.
Predator-prey interactions have long been thought to promote adaptive evolution in artificial systems (Koza, 1991; Sims, 1994). A "Red Queen's race" between predator and prey may create an arms race of adaptation between them. A body of work by Nolfi et al. (Floreano and Nolfi, 1997; Nolfi and Floreano, 1998) and related work by Buason et al. (Buason and Ziemke, 2003; Buason et al., 2005) explored coevolution of robot predators and prey, using a simulated version of a hardware robot. Our work in this paper differs in that we use a generative method for "growing" our neural network controllers, rather than evolving network weights only; however, some of the work by Buason (2003) involved the evolution of the robot bodies as well as their brains, which we do not discuss in this paper.
Whereas competitive co-evolution does sometimes produce interesting innovations, it can also, like single-species evolution, become stagnant. Alternatively, it can enter repeating "rock-paper-scissors"-like cycles, where predators and prey repetitively cycle through the same finite set of strategies to pursue, or evade, one another. Nolfi (2012) discusses these apparent obstacles to open-ended evolution in a review article, identifying some characteristics that promote open-ended evolution; his answer can be generalized to say that "richness" of the evolutionary possibilities is important: he suggests that body-brain co-evolution contributes to this richness; as does "the ability of agents to adapt ontogenetically" or during their lifetimes, for example with memory; as does richness of the task and environment. It is our belief that generative developmental systems, as opposed to fixed genetic representations, may also add to the richness of evolutionary possibilities. In general, the larger the number of evolutionary avenues open to the combined co-evolutionary process of all the species, the less likely this process is to tread the same evolutionary pathways over and over. (Clearly, the natural environment on Earth is vast and widely varying, especially if one includes the co-evolution of great numbers of competing species; so it does possess this qualitative "richness".) In the work described here, we include four possible sources of evolutionary richness: 1) a generative developmental system for growing our neural network controllers, 2) a complex simulated robot body (which, herein, does not evolve, however), 3) two co-evolving species, a predator and prey, and finally, 4) we divide our population into a metapopulation of many local demes, with occasional migration between them, such that behaviors unique to a local "race" of one species might evolve in a subset of demes.

Robot Bodies and Actuation by Controllers
We use a fixed hexapod "spider" robot body (three of which are visible in Figure 1). Each of the six legs has three degrees of freedom (DOF): from the center of the head, looking outward along one of the upper leg segments, the "thigh" segment can move left-right and up-down. The attached "foreleg" segment can move up-down only. Neither joint can twist. Thus, in total, each robot has thirteen rigid body "parts", and twelve joints with 18 total DOF, all of which are actuated. Each joint axis has fixed limits to its range of motion. The neural network controlling each robot body has one Output neuron for each degree of freedom; when it assumes a value of +1, it is calling for its corresponding joint axis to be at its maximum range limit; a value of -1 calls for the minimum range limit. A simulated "spring" between the actual and requested positions generates a force on the joint axis.
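The mapping from an Output neuron to a joint axis can be illustrated with a minimal sketch. The text specifies only the two endpoints (+1 requests the maximum limit, -1 the minimum) and the presence of a simulated spring; the linear interpolation between the limits, the `stiffness` constant, the angle units, and all names below are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class JointAxis:
    lo: float                 # minimum range limit (units assumed: radians)
    hi: float                 # maximum range limit
    stiffness: float = 10.0   # hypothetical spring constant

    def target_angle(self, output: float) -> float:
        """Map an Output value in [-1, 1] onto [lo, hi] (linear map assumed)."""
        u = max(-1.0, min(1.0, output))
        return self.lo + (u + 1.0) * 0.5 * (self.hi - self.lo)

    def spring_torque(self, output: float, actual_angle: float) -> float:
        """Spring pulling the actual joint angle toward the requested one."""
        return self.stiffness * (self.target_angle(output) - actual_angle)


axis = JointAxis(lo=-0.5, hi=1.0)
print(axis.target_angle(1.0), axis.target_angle(-1.0))
print(axis.spring_torque(1.0, actual_angle=0.0))
```

With 18 such axes per robot, one controller update would amount to 18 calls of this kind per time step.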
We used similar robot bodies in (Palmer, 2011), but here we have eliminated the body orientation and velocity sensor Inputs, and replaced them with an "artificial visual cortex".

Artificial Visual Cortex
An artificial "visual cortex" allows the spiders to "see" other objects of three types along six lines of sight. For example, in Figure 1, the green (which indicates the prey species) spider at lower right sees three objects situated around it: one spider of the same (green) species, one spider of the other (purple, indicating the predator) species, and one barrier object (cylinder). The lines of sight radiate from the spider at 60 degree angles; objects falling in a ~60 degree arc, centered on each line of sight, will register on the artificial visual cortex.
The 18 Input neurons of the visual cortex are arranged in three rows of six, as shown in Figure 2. A particular neuron activates when an object of its type is in a particular 60-degree arc (centered on one of the lines of sight): the rows encode the object type, and the columns encode the viewing direction. In Figure 2, three neurons in the visual cortex are activated (larger size), indicating the presence of one spider of the same species to the front left ("same2"), one spider of the other species to the front right ("other3"), and one barrier object to the left ("barrier1"), as they were situated in Figure 1. The neurons activate more strongly for closer objects.
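The row-and-column encoding described above can be sketched as follows. Only the 3 x 6 layout, the 60-degree arcs, and the fact that closer objects activate more strongly come from the text. The reference direction for bearings, the linear falloff with distance, the `MAX_RANGE` value, and the rule that the closest object in a cell wins are all assumptions chosen for illustration.

```python
import math

ROWS = ("same", "other", "barrier")   # one row per object type
N_DIRECTIONS = 6                      # one column per line of sight
SECTOR = 2 * math.pi / N_DIRECTIONS   # each column covers a 60-degree arc
MAX_RANGE = 20.0                      # hypothetical sensing range


def sector_index(bearing: float) -> int:
    """Column for a bearing in radians, measured counterclockwise from an
    assumed reference direction at the edge of column 0."""
    return int((bearing % (2 * math.pi)) // SECTOR) % N_DIRECTIONS


def activation(distance: float) -> float:
    """Assumed falloff: 1 at contact, 0 at MAX_RANGE and beyond."""
    return max(0.0, 1.0 - distance / MAX_RANGE)


def visual_cortex(objects):
    """objects: iterable of (row_name, bearing, distance) tuples.
    Returns a 3 x 6 grid of activations; the closest object in a cell wins."""
    grid = [[0.0] * N_DIRECTIONS for _ in ROWS]
    for row_name, bearing, distance in objects:
        r = ROWS.index(row_name)
        c = sector_index(bearing)
        grid[r][c] = max(grid[r][c], activation(distance))
    return grid


seen = [("same", math.radians(150), 5.0),
        ("other", math.radians(210), 10.0),
        ("barrier", math.radians(90), 8.0)]
for name, row in zip(ROWS, visual_cortex(seen)):
    print(name, row)
```

Under the assumed bearing convention, this example activates the cells corresponding to "same2", "other3", and "barrier1".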

Growth of Neural Networks
We use the L-Brain method (Palmer, 2011) for "growing" neural networks according to inherited sets of growth rules. In the L-Brain method, a neural network unfolds in three dimensions according to cell division rules comprising: 1) a predicate type, 2) a conditional expression that indicates when and where the rule may be applied, and 3) two successor types. Beginning from a single protoneuron of a certain type (indicated by the light blue color of the sphere in the top left panel of Figure 3), the rule set is repeatedly searched for applicable rules. If the predicate of a rule matches the type of a protoneuron, and the conditional expression evaluates to true, then the protoneuron divides into two protoneurons, each with one of the successor types (indicated by various sphere colors). The conditional expressions are intended to control neural development in a space-, time-, and context-dependent way, analogous to natural gene regulation. The expressions consist of a sequence of tokens of several types defining a Reverse Polish Notation (RPN) arithmetic expression, which operates on a set of four stacks (one stack of floating point values, one boolean stack, and two integer stacks; see (Palmer, 2011) for full details). As each token is evaluated in sequence, values may be popped from the stacks, specific operations performed on them (for example, two values might be summed, or tested for equality), and the result pushed back to a particular one of the stacks. Some tokens may push values onto the stacks that depend on time (the division number) or on the position of the neuron in space. After all the tokens in an expression have been evaluated, the stacks will in general hold a number of values. One of these values (the top value on the boolean stack) is used to determine whether the expression evaluates true, which permits a cell division to occur. Other values (from the floating point and boolean stacks) are used to specify parameters required by the neurons, for example, the weights applied to a neuron's inputs; see (Palmer, 2011) for details. A maximum of seven divisions are applied; the direction in space of the division also depends on values from the boolean stack. The size of each protoneuron in the panels of Figure 3 indicates the step at which it stopped dividing.
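To make the stack-based evaluation concrete, the following is a toy sketch of an RPN evaluator over four stacks. It is not the L-Brain token set, which is defined in (Palmer, 2011); the token names (`TIME`, `POS_X`, `ADD`, `LESS`, `AND`, `WANT_IN`, `WANT_OUT`), the treatment of stack underflow, and the result when the boolean stack is empty are all hypothetical choices made for illustration.

```python
def evaluate_condition(tokens, division_number, position):
    """Evaluate a token sequence against four stacks and report whether the
    rule may fire. All token names are hypothetical stand-ins."""
    floats, bools, want_ins, want_outs = [], [], [], []
    for tok in tokens:
        if isinstance(tok, float):                  # literal constant
            floats.append(tok)
        elif tok == "TIME":                         # depends on the division number
            floats.append(float(division_number))
        elif tok == "POS_X":                        # depends on position in space
            floats.append(float(position[0]))
        elif tok == "ADD" and len(floats) >= 2:
            b, a = floats.pop(), floats.pop()
            floats.append(a + b)
        elif tok == "LESS" and len(floats) >= 2:    # float operands, boolean result
            b, a = floats.pop(), floats.pop()
            bools.append(a < b)
        elif tok == "AND" and len(bools) >= 2:
            b, a = bools.pop(), bools.pop()
            bools.append(a and b)
        elif isinstance(tok, tuple) and tok[0] == "WANT_IN":
            want_ins.append(tok[1])                 # first integer stack
        elif isinstance(tok, tuple) and tok[0] == "WANT_OUT":
            want_outs.append(tok[1])                # second integer stack
        # Unknown tokens, and operators with too few operands, are skipped
        # (an assumption of this sketch).
    may_divide = bool(bools) and bools[-1]          # top of the boolean stack
    return may_divide, floats, bools, want_ins, want_outs


# "Divide only before the third division": TIME 3.0 LESS
print(evaluate_condition(["TIME", 3.0, "LESS"], 2, (0.0, 0.0, 0.0))[0])
print(evaluate_condition(["TIME", 3.0, "LESS"], 5, (0.0, 0.0, 0.0))[0])
```

The values left on the stacks after evaluation are returned alongside the boolean result, mirroring the way leftover values are said to parameterize the neurons.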
At the bottom center panel of Figure 3, the final division occurs, and one additional application of the rules converts some of the final protoneurons into neurons of several classes, including Sigmoid (purple), Delay (cyan), and Oscillating (yellow). Sigmoid neurons sum their inputs plus a "bias" value, and apply a sigmoid normalization function to keep the output in the range [-1, 1]. Delay neurons take their input and buffer it for a certain number of time steps in a FIFO queue, then output it. Oscillating neurons oscillate sinusoidally between -1 and 1 over a fixed period; they have no inputs; see (Palmer, 2011). Also in the bottom center panel, a fixed set of 18 Input neurons (green, the "visual cortex" neurons from Figure 2) and 18 Output neurons (red, each of which will control one of the 18 DOF of the robot) are introduced.
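The three neuron classes can be sketched in a few lines. The use of `tanh` as the squashing function, the zero-filled initial state of the delay queue, the phase parameter, and the class and method names are assumptions; the text specifies only the output range, the FIFO buffering, and the sinusoidal oscillation over a fixed period.

```python
import math
from collections import deque


class SigmoidNeuron:
    def __init__(self, weights, bias):
        self.weights = list(weights)
        self.bias = bias

    def step(self, inputs):
        total = self.bias + sum(w * x for w, x in zip(self.weights, inputs))
        return math.tanh(total)   # assumed squashing function with range (-1, 1)


class DelayNeuron:
    def __init__(self, delay_steps):
        # Queue pre-filled with zeros (an assumption about the initial state).
        self.queue = deque([0.0] * delay_steps)

    def step(self, value):
        self.queue.append(value)
        return self.queue.popleft()   # emits the value from delay_steps ago


class OscillatingNeuron:
    def __init__(self, period_steps, phase=0.0):
        self.period = period_steps
        self.phase = phase            # hypothetical parameter

    def step(self, t):
        return math.sin(2 * math.pi * t / self.period + self.phase)


delay = DelayNeuron(2)
print([delay.step(v) for v in (1.0, 2.0, 3.0)])
print(OscillatingNeuron(period_steps=4).step(1))
```

An Oscillating neuron wired directly to several Output neurons, with positive or negative weights, is enough to produce in-phase or opposite-phase joint motion.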
In the bottom right panel of Figure 3, synaptic connections are formed. These also grow according to the inherited rule set: briefly, the final neurons have a set of "preferred" types with which they would like to form connections, both incoming and outgoing; these preferred types are called "want-ins" and "want-outs", and are supplied, respectively, by the two integer stacks. The final connections that are made satisfy a combination of these preferences with a locality requirement. The L-Brain method itself is not the focus of this paper, but see (Palmer, 2011) for much more detail. A video of the unfolding developmental process is available here: http://www.youtube.com/alifespider

Evolutionary Parameters
Predator and prey interactions. In (Palmer, 2011), we successfully used a single species to evolve a neural controller that would direct the 18 DOF of the robot to produce a "galloping" gait, and then track a compass heading to gallop to the North. In this paper, our goal is to study the interaction of two co-evolving species, one predator and one prey, selected for hunting and evasion behavior. The two species have identical body configuration and physical strength (maximum motor torque in each DOF), and their brains have identical growth constraints (same number of divisions, neuron types, etc.), but the two species are scored differently. A predator individual receives credit for "eating" a prey individual, by physically colliding with it; the prey is penalized for being eaten, and rewarded for eating inanimate barrier objects. (More details on scoring are given below.)
Fitness evaluation in physically simulated local "demes". We place N=25 individuals of each species into each of D separate demes (local populations), where D ranged from 16 to 320; thus the total metapopulation size is ND individuals, ranging from 400 to 8,000, of each species. Each individual has a distinct genotype, i.e., a distinct set of inherited rules. Both species are asexual. All the 2N robot bodies in a single deme are simulated together, along with N barrier objects; thus they may physically interact. Fitness is relative among all individuals of each species, within one deme. A single evaluation lasts for 2,000 time steps of 1/30 second each, for a total of just over 1 minute of simulated time. During this time, robots accumulate a score at each time step, according to the details of the physical simulation, including the velocities of the robots, and whether collision events between objects occur. When a prey is "eaten", it receives a score penalty, but does not disappear from the simulation; rather it is "regenerated" (retaining its accumulated score) in a new random location and the simulation proceeds. Similarly, when a barrier is eaten, it also moves to a new location. Individuals migrate to a new random deme at a rate of 0.01 per generation, which connects all demes into a large metapopulation. Thus one evolutionary generation consists of: 1) fitness evaluation via 2,000 time steps of physical simulation; 2) reproduction according to relative fitnesses; 3) possible mutation of the "rules" making up each genotype (at a rate of 0.05 per rule per generation); and 4) migration among demes. Evolutionary runs of 3,000 to 10,000 generations were typically performed.
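The four steps of one generation can be sketched for a single species as follows. The rates are taken from the text; everything else is an assumption made for illustration: fitness-proportional sampling as the reading of "reproduction according to relative fitnesses", migration implemented as a swap so that deme sizes stay fixed, and the hypothetical callbacks `evaluate` (standing in for the physical simulation) and `mutate_rule`.

```python
import random

MUTATION_RATE = 0.05    # per rule per generation (from the text)
MIGRATION_RATE = 0.01   # per individual per generation (from the text)


def next_generation(demes, evaluate, mutate_rule, rng):
    """demes: list of demes; each deme is a list of genotypes; each genotype
    is a list of rules. evaluate(deme) returns one non-negative fitness per
    genotype. Returns the demes of the next generation."""
    new_demes = []
    for deme in demes:
        fitness = evaluate(deme)                              # 1) evaluation
        weights = fitness if sum(fitness) > 0 else None       # uniform fallback
        parents = rng.choices(deme, weights=weights, k=len(deme))  # 2) reproduction
        offspring = [
            [mutate_rule(rule, rng) if rng.random() < MUTATION_RATE else rule
             for rule in genotype]                            # 3) mutation
            for genotype in parents
        ]
        new_demes.append(offspring)
    for deme in new_demes:                                    # 4) migration
        for j in range(len(deme)):
            if rng.random() < MIGRATION_RATE:
                other = new_demes[rng.randrange(len(new_demes))]
                m = rng.randrange(len(other))
                deme[j], other[m] = other[m], deme[j]         # swap keeps sizes fixed
    return new_demes


rng = random.Random(0)
demes = [[["rule%d" % i] for i in range(25)] for _ in range(16)]
demes = next_generation(demes, lambda d: [1.0] * len(d),
                        lambda rule, r: rule + "'", rng)
print(len(demes), len(demes[0]))
```

In the actual system, both species in a deme are evaluated together in one physical simulation, so `evaluate` would need access to the opposing species and the barriers; the sketch abstracts that away.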

Results
As cited above, it has long been suggested that competitive co-evolution can drive adaptive progress. Therefore, we had initially looked to co-evolution as a "magic bullet" that would provide an indefinite series of challenges, as the two species engaged in an arms race of adaptive improvement. However, we found that, in practice, a proper balance of selective forces was quite difficult to get right. Although it may seem that an interesting predator-prey interaction can be observed almost anywhere one looks in nature, the two-species interactions present today are themselves the result of a selective process. That is, we only observe those two-species interactions where both species have not died out, either due to extinction of the prey (when the predator is too efficient a hunter), or extinction of the predator (when the prey is too effective at escaping), or both. In our simulations, we do not allow either species to go extinct; we use a fixed population size and relative fitnesses. However, instead of extinction, a common result in our initial experiments was evolutionary stagnation; for example, the prey would not evolve a forward running gait, because doing so causes them to risk blindly running into predators. Thus the initial part of our experiments was characterized by manual tuning of the scoring function, in a sequence of attempts to get the two species to evolve a galloping gait, and to interact, as follows.
Initial selection for forward motion. Our first experiments did not include the barrier objects, only the predators and prey, and we conducted them with D=16 demes of N=25, or ND=400 individuals of each species. Most initial randomly generated genotypes (L-Brain rule sets) produce no motion in the bodies they control; most do not even make any connections to their output neurons. However, a few genotypes may produce a wiggling motion in the robot, which may be improved gradually by evolution into a galloping gait, and by the ability to steer according to the inputs from the visual cortex, ultimately producing the ability to hunt or evade the other species. Although our focus was hunting and evasion behavior, we expected that some initial selection for basic forward motion would greatly speed up evolution of hunting, because stationary robots all fail to hunt, or evade, and thus cannot be differentially rewarded for this behavior. Therefore, we initially included a positive bonus to reward forward motion in both species. The result was that both species would typically evolve a forward gait in fewer than 300 generations.

Introduction of barrier objects.
We next introduced an additive reward for the predator, and an additive penalty for the prey, for captures of a prey by a predator. Unfortunately, if the penalty to the prey was strong, this caused the prey to evolve a dramatically reduced speed, or to stop; forward motion causes it to risk running into a predator, and slowing or stopping reduces this risk. To encourage the prey to continue forward motion, we introduced the barrier objects, and a reward to the prey for capturing them; this reward was large: four times the penalty of being caught by a predator. Even when a prey is running blindly (without use of input from the visual cortex), it receives a reward from time to time by blindly colliding with barriers, and over a 2,000 time step evaluation, this is greater on average than the total penalty from colliding with predators.
Direct selection on network properties. At this point in our experiments, both predator and prey would easily evolve to run forward blindly, but their networks usually did not possess any connective pathways from the visual cortex Input neurons to any of the Output neurons that directly activate the legs. Such pathways are necessary for steering by visual cues. An example of such a "blind" brain is shown in Figure 4. The color and thickness of the lines connecting the neurons indicate the sign (black indicating positive sign, and red negative) and magnitude of their weight. In this network, when the yellow Oscillating neuron pulsates, the several connected red Output neurons also pulsate, either in phase if they are connected by a black line, or in opposite phase if connected by a red line, producing a pattern of movement in the legs that generates a running gait. The speed of the gait produced by this network is quite fast, but it does not steer according to visual cues, because the Inputs do not pass a signal to the Outputs by any pathway.
We decided to select directly on properties of the neural networks. However, a bonus proportional to the total number of neurons was unproductive. A bonus proportional to the longest connected pathway was also unproductive. We found that a bonus that counted the total number of Input neurons that possessed some connective pathway to at least one Output did promote the evolution of running with visual steering. If totalInputsConnected is the number of Inputs that have some pathway downstream to some Output, then we compute the factor F1 = pow(1.10, min(5, totalInputsConnected)), and multiply the total score by this factor. The factor F1 provides a 10% multiplicative reward for up to 5 Inputs that are connected by some path to an Output. This reward biases variation to the neighborhoods of networks that we desire, i.e., those that have Inputs somehow connected to Outputs. We subsequently also added an additional 2% multiplicative bonus F2 = pow(1.02, min(5, longestPath)), where longestPath is the longest path present in the network. This bonus rewards networks with longer pathways (up to a length of 5) in order to speed the stage of evolution where complete pathways between the Inputs and Outputs are being found.
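The two factors can be computed as in the following sketch. The formulas for F1 and F2 are those given above; the adjacency-dictionary representation of the network, the breadth-first search used to count connected Inputs, and the function names are assumptions. The value of longestPath is taken as given rather than computed, since the text does not say how it is measured.

```python
from collections import deque


def inputs_connected(edges, inputs, outputs):
    """Count Inputs with some directed path to at least one Output.
    edges: dict mapping a neuron id to the ids of the neurons it feeds."""
    outputs = set(outputs)
    count = 0
    for src in inputs:
        seen, frontier = {src}, deque([src])
        found = False
        while frontier and not found:
            node = frontier.popleft()
            for nxt in edges.get(node, ()):
                if nxt in outputs:
                    found = True
                    break
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
        count += found
    return count


def network_bonus(total_inputs_connected, longest_path):
    """Multiplicative score factor F1 * F2, each capped at 5 steps."""
    f1 = pow(1.10, min(5, total_inputs_connected))
    f2 = pow(1.02, min(5, longest_path))
    return f1 * f2


edges = {"in0": ["h0"], "h0": ["out0"], "in1": ["h1"], "h1": ["in1"]}
n = inputs_connected(edges, ["in0", "in1"], ["out0"])
print(n, network_bonus(n, longest_path=2))
```

At the caps, F1 reaches 1.10^5, about 1.61, and F2 reaches 1.02^5, about 1.10, so the combined factor can raise a score by at most roughly 78%.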

Density of objects.
At the beginning of an evaluation, all spider bodies and barrier objects are initially created in random positions within a circle of a certain radius. When objects are "eaten", they are "regenerated" in a new position, with the new positions similarly distributed within the same circle. If a spider of either species ever runs outside the perimeter of the circle, it is also moved to a new interior location; this prevents the spiders from scattering so far that they cease to interact. We found that the density of spiders and objects imposed by this limiting circle was important: if they are too dense, then captures happen too easily by accident; if they are very sparse, then too few captures happen within an evaluation run of 2,000 time steps, so selection is noisy (i.e., too dependent upon the luck of being initially near a target). We arrived at a suitable radius for the containing circle for the N=25 spiders of each species and the 25 barriers, i.e., about 50 simulation length units, where the span of the spider's legs in a relaxed stance is 1.5 units; the resulting density can be seen in Figure 5.
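Placement and regeneration within the limiting circle can be sketched as below. The radius of about 50 units comes from the text; the uniform distribution over the disc (the text says only "random positions"), the circle being centered at the origin, and the function names are assumptions.

```python
import math
import random

ARENA_RADIUS = 50.0   # simulation length units (from the text)


def random_position(rng, radius=ARENA_RADIUS):
    """Sample a point uniformly over the disc (assumed distribution).
    The square root keeps points from clustering near the center."""
    r = radius * math.sqrt(rng.random())
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return (r * math.cos(theta), r * math.sin(theta))


def keep_inside(position, rng, radius=ARENA_RADIUS):
    """Move a spider that has left the circle to a new interior location."""
    if math.hypot(*position) > radius:
        return random_position(rng, radius)
    return position


rng = random.Random(1)
print(random_position(rng))
print(keep_inside((60.0, 0.0), rng))
```

With 50 spiders and 25 barriers in a disc of area pi * 50^2, about 7,854 square units, this corresponds to roughly one object per 105 square units.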

Initial hunting success.
With the above scoring function and object density, we began to have success evolving hunting behavior in both species, using N=25 and D=16. In Figure 5, an example of successful hunting behavior in both species is shown: a predator tracks a prey, which is itself tracking a barrier object. The spiders leave colored "breadcrumbs" behind them to make their recent track visible (however, the breadcrumbs are not visible to the spiders). A video of successful hunting behavior is available at: http://www.youtube.com/alifespider
One alternative outcome to the evolution of hunting in both species is that the prey may become faster runners (i.e., by developing a more efficient gait) than the predators, such that even a predator that is successfully tracking a prey cannot catch up; this in turn reduces the selective advantage to the predators of good tracking, and they cease to improve it, or may even lose it. (Thus, interestingly, predators that are good at tracking get more "practice", and become better at it.)
"Orbit the barrier" baiting behavior. One alternative adaptive strategy taken by the predators, if a deme enters the slower predator / faster prey condition, is what we call predators' "orbit the barrier" behavior. In some runs, the predators would circle around a barrier object, apparently waiting for prey to track toward the barrier. When the prey finally approaches and "eats" the barrier, it is not difficult for the predator to move into the center of the barrier (which has just disappeared, having been "eaten"), and capture the prey. In Figure 6, a predator circles a barrier as a prey approaches. A video of the "orbit the barrier" behavior is available at: http://www.youtube.com/alifespider
Larger metapopulations. With D=16, not every run would produce successful hunting in both species. Using a single node of our 20-node computing cluster (each node contains 2 E5520 4-core CPUs), we are able to conduct a D=16 run at a rate of about 200 generations per hour, for N=25 (25 individuals of each species, and 25 barriers). In order to run larger metapopulations, we linked the 20 cluster nodes together by passing migrant individuals among them. That allowed us to run one large metapopulation of D=20*16=320 demes (or ND=8,000 of each species) on the entire cluster, at the same rate of 200 generations per hour.
Figure 5: A predator (purple) tracking a prey (green), which is, in turn, tracking a barrier object.
Figure 6: A predator (center, purple) engages in "orbit the barrier" behavior, waiting for prey (green) to approach.
Runs with larger metapopulations produced additional refinements to behavior. The "orbit the barrier" behavior we previously described first arises by predators blindly bumping into a barrier, and having a gait that does not allow them to disengage from it. In larger runs, we commonly see this being refined by predators that can visually track barriers, close on them, and then circle them. This appears to occur when the prey are already accomplished trackers: a "baiting" predator is relying on the prey's tracking ability.
In these larger runs, we observed prey that shy away from predators: if one of these prey individuals is tracking towards a barrier, and the experimenter manually places a predator in its path, it will detect the predator (the appropriate "other" Inputs are connected, and activate) and divert its course, in order not to collide with the predator. Interestingly, we have also observed prey that will shy away from other prey (and we can see that the appropriate "same" Inputs are activated during this behavior). The adaptive value of this may be that two spiders that collide usually end up with their legs tangled together, which they often cannot disentangle, preventing them from running; two entangled prey are easy targets.
Interestingly, we have also, rarely, observed predators that track toward other predators, so that they collide with them; we are not sure whether this is adaptive or not. We have only observed it in the case of slow predators / fast prey, where the predators are also actively tracking the barriers; so it is possible that they benefit by tracking other predators when those predators are likely to already be circling barriers. If this behavior is in fact selected, it would be a third-order interaction, i.e., the prey are attracted to barriers; thus predators are attracted to the barriers; thus predators are attracted to other predators. This might be selective relative to being blind, but not relative to tracking the barriers directly.
Typical brain structure for hunting behavior. A common structure for a brain (here, a prey) that exhibits successful tracking behavior is shown in Figure 7. Only neurons that are "upstream" of some Output are shown in the figure, because only those can affect the gait. Two green Input neurons are so connected, "barrier0" and "barrier5". The typical behavior produced by such brains is to run in circles until a target object (a barrier in this case) appears (e.g., after being eaten and moved to a new location) near the spider. When one of the Inputs detects a target, the spider will turn left or right until it is moving toward the target. It zigzags back and forth as it closes on the target, with the target alternately activating the two Input neurons as it passes into their line of sight. Each activation causes a "zig" or a "zag" that diverts the path of the spider back toward the target, until it eventually closes on the target and "eats" it.
A common way this evolved tracking algorithm may fail is when two or more target objects are nearby on either side of the spider; this can cause tracking anomalies, such that capture fails. In addition, with moving targets (i.e., a prey being tracked by a predator), the target may move across one of the lines of sight, and outside the "tracking cone", also causing failure to capture. Commonly, a spider is under time pressure to quickly capture a target that it has sighted, lest a competing spider get to it first. Videos of spiders competing in this way are available at: http://www.youtube.com/alifespider
Complex evolutionary dynamics. The evolutionary dynamics created by this rich environment can be complex. In Figure 8, the prey (green points) actually slow down between generations 700 and 800 (top panel, Distance Covered) while they simultaneously increase Captures of barriers (middle panel). This is associated with an increase in the number of Inputs Connected to Outputs (bottom panel) by some pathway, indicating that they have made a trade-off of speed for better tracking ability. This is associated with a mean drop in Captures (middle panel) by the predators (red points), as well as a fitness decrease (not shown), indicating that they were relying on the prey blindly running into them for some captures. Their apparent response in the short term is to speed up, and reduce their mean number of Inputs Connected. It is not until generations 1100-1300 that the predators are able to increase their mean Captures again, not by increased speed, but apparently by better tracking, associated with a gradual increase in Inputs Connected to Outputs.

Conclusions
We have demonstrated a system that produces complex predator-prey dynamics, in a realistically modeled physical environment, with a generative developmental process producing the neural network controllers.
It turns out in practice that an "arms race" between predator and prey in artificial evolution is nontrivial to produce. For example, we initially encountered the situation where the prey would stop running, in order not to blindly run into predators, because it had not yet evolved effective use of its visual cortex. Thus we resorted to a number of ad hoc scoring changes, including direct selection for forward motion, and selection on network properties. In both cases, we are intelligently searching for evolutionary "stepping stones" to produce a particular result, tracking behavior in this case. Presumably, the evolutionary process, even without such hints (e.g., with selection only including a reward for hunting and a penalty for being hunted), would eventually find a solution (even by drift alone), but this could take far longer. Thus, ironically, in order to produce a rich predator-prey interaction, in the hope that this would produce open-ended evolution, we found it necessary to start the process with "scaffolding" hints.
Figure 7: A (prey) brain that exhibits successful tracking behavior by zig-zagging toward a barrier object.
One interesting effort that aims to make evolution more efficient is called "novelty search" (Lehman and Stanley, 2011), which keeps track of the regions of phenotypic space that have been previously searched, and does not produce similar organisms again. However, when phenotypic space becomes very multidimensional, it may not be straightforward to characterize and record the previously searched volume of phenotype space; nor is it clear how to reduce the dimensionality of this representation (to compress, and to search, the record) in general. For example, if the goal were to produce increasingly efficient running gaits in a single species, it might initially be sufficient to penalize individuals that exhibit the non-novel behavior of standing still (which many randomly generated rule sets produce). However, later on, when many individuals are running, this may be insufficient if the population has settled into a local optimum. We might have to identify and measure many subtle aspects of running gaits to identify which behaviors we should call "similar", in order to reward novelty. We suggest that simple, static novelty metrics will fail to "scale up", in the sense of continuing to produce improvement. In general, the problem of defining an increasingly complex novelty metric will be about as difficult as defining a sequence of evolutionary stepping stones.
Although evolutionary computing does not commonly employ a fitness that is negatively frequency-dependent (a fitness that decreases as the frequency of that phenotype increases), many natural processes produce such selection; "apostatic selection" is selection that favors individuals deviating from the norm (Ayala and Campbell, 1974).
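As an illustration of the definition above, the following sketch discounts each individual's fitness in proportion to how common its phenotype is in the current population. It is a toy example under stated assumptions, not the scoring used in our simulations: the function name `frequency_dependent_fitness`, the linear discount, and the `strength` parameter are all hypothetical, and phenotypes are reduced to discrete labels for simplicity.

```python
from collections import Counter

def frequency_dependent_fitness(phenotypes, base_fitness, strength=1.0):
    """Return one fitness value per individual, reduced for common phenotypes.

    Each individual's base fitness is multiplied by
    (1 - strength * frequency of its phenotype in the population),
    so rarer phenotypes keep more of their base fitness.
    """
    counts = Counter(phenotypes)
    n = len(phenotypes)
    return [
        base * (1.0 - strength * counts[p] / n)
        for p, base in zip(phenotypes, base_fitness)
    ]

# Three individuals use strategy "A" and one uses strategy "B";
# all have the same base fitness.
population = ["A", "A", "A", "B"]
scores = frequency_dependent_fitness(population, [1.0, 1.0, 1.0, 1.0])

print(scores[3] > scores[0])  # the rare "B" individual scores higher
```

Under this kind of scoring the rare strategy is favored only while it remains rare; as "B" spreads, its advantage shrinks, which is the defining feature of negative frequency dependence.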
We note that predator-prey interactions force a temporally local "novelty search" through apostatic selection: when the predator adopts strategy A, and the prey adopts strategy B to counter it, then at least for a short time, the predator is forced to find a novel, non-A solution.
Importantly, no manual dimensionality-reduction of the phenotype space is required: the discouraged strategy (A) is encoded, in a sense, in the genome of the prey. When the predator changes to another strategy, and the prey follows, then this "memory" of the previously covered region of phenotype space is lost. Or is it? It is possible for second-order selective effects related to evolvability (Wagner and Altenberg, 1996), i.e., the tendency to produce adaptive variation, to shape the genome: even though the prey is no longer currently expressing the B strategy, its genome may now be more easily able to re-evolve the B strategy. This produces a "memory" on a longer timescale: if the predator adopts A again, it may be more quickly countered with B. Predators that evolve a novel, non-A strategy can thus be rewarded on this longer timescale. Only when A has been avoided (and novelty has been enforced) for a very long time may this "memory" eventually fade. A similar dynamic may occur when species compete for limited resources: when one resource is overexploited, novel use of available resources is favored. Good "resource switchers" may be favored in the long term. Similarly, not only are specific predation behaviors selected in the short term, but the general ability to evolve among a range of predation behaviors, in response to locally prevalent prey counter-strategies, may be favored in the long term.
This article has been a largely qualitative demonstration of the ability of our system, given some encouragement by scaffolding, to produce complex predator-prey interactions. The system does successfully produce third-order interactions (predators tracking other predators, which track barriers, which are tracked by prey; this increases the chance that the first predator type collides with prey). This clearly adds to the "richness" of behavioral interactions many levels removed from the mutation of rules in the genotype. Our intention now is to use this platform to study how this richness and diversity of phenotype can be made indefinitely self-sustaining.

Figure 1: A focal spider sees three other objects in its environment along three of its six lines of sight (arrows).

Figure 2: An "artificial visual cortex" registers the presence of three objects, indicated by the three large neurons.

Figure 3: From a single initial protoneuron, seven division steps produce many protoneurons. Cell differentiation into types (colors) and the direction of division (X, Y or Z) are dictated by an inherited set of growth rules. Certain protoneurons convert into neurons, and red Outputs and green Inputs appear (bottom center panel). Synaptic connections form (bottom right panel).

Figure 4: A "blind" galloper. The yellow Oscillator neuron at right produces a galloping gait by the pattern of pulsations induced in the red Output neurons. However, none of the green Input neurons has any pathway to any of the red Output neurons; this individual is blind.