Evolutionary game theory of Bellman agents

Investigating the origin of cooperation in its various guises for the first time from the ground up.

The need and impact

Interactions between animals or people are rarely simple. The key tool for modelling and explaining such interactions is game theory, and in particular stochastic game theory, in which agents select actions that move them between states representing their universe and earn rewards, in the presence of noise. It is usually built on a vocabulary of abstract strategies, such as “cooperate” and “defect”. This limits its explanatory power for phenomena that can only be understood by explicitly addressing complex sequential behaviours.

Several persistent thorny questions have precisely this character; for example, how token-based cooperation arises, and – we argue – why males exist, when sexual reproduction could instead be based on a single sex, with the paired organisms raising two offspring independently.

Evolutionary game theory has been immensely successful in explaining the development of strategies used by animals and humans. Reinforcement learning underlies many successful applications of machine learning. Connecting the two enables us to account for the evolution of some very interesting biological behaviours, from a sequence of grounded actions all the way through to the complete emergence of the phenomena under investigation. We can thus illuminate the role played by fiat money in contemporary society, isolating its essence from its downstream cumulative effects (inheritance, debt, etc.).

A potential outcome is a novel and stable mechanism of symbol-facilitated cooperation that retains the positive properties of money. We will also create a model of how evolved agents trade resources for space, which will enhance the fundamental understanding of “how the world works”.

While clarifying the meaning of coercion has no tangible value, it is a prerequisite to coherent scientific discussion of the issues around power and inequality. Being able to apply the same framework to give a novel account of the maintenance of something as fundamental as sexual reproduction would be a triumph.

The approach

Evolutionary game theory is a way of understanding how living agents come to be the way they are, and why interactions take the form that they do.

As we have shown in work that is under review, for the specific case in which two agents try to swap objects, such sequential behaviours can be discovered using Bellman’s technique (value iteration, a dynamic-programming method at the core of reinforcement learning). More generally, weaving Bellman’s technique into evolutionary game theory will enable us to investigate the origin of cooperation in its various guises for the first time from the ground up: building behaviours as sequences of grounded actions.
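
As a rough illustration of what Bellman’s technique does, the sketch below runs value iteration on a tiny, made-up single-agent task. It is not the two-agent swapping model from the work under review: the three states ("holding", "offered", "swapped"), the two actions ("wait", "advance"), the 0.8 success probability, the 0.1 effort cost and the discount factor are all assumptions chosen only to show how a sequence of grounded actions can emerge from repeated Bellman updates.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Repeated Bellman optimality updates on a small finite task.

    P[a, s, s2] is the probability of moving from state s to s2 under action a.
    R[a, s] is the expected immediate reward for taking action a in state s.
    Returns the state values and the greedy action for each state.
    """
    V = np.zeros(P.shape[1])
    for _ in range(max_iter):
        Q = R + gamma * (P @ V)          # action values, shape (actions, states)
        V_new = Q.max(axis=0)            # best achievable value in each state
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    Q = R + gamma * (P @ V)
    return V, Q.argmax(axis=0)

# Hypothetical toy task (all numbers are illustrative assumptions).
# States: 0 = holding own object, 1 = offer made, 2 = swapped (absorbing).
# Actions: 0 = wait (stay put, no cost), 1 = advance (noisy, costs effort).
p_success = 0.8                          # noise: an advance fails 20% of the time
P = np.zeros((2, 3, 3))
P[0] = np.eye(3)                         # waiting never changes the state
P[1] = np.array([[1 - p_success, p_success, 0.0],
                 [0.0, 1 - p_success, p_success],
                 [0.0, 0.0, 1.0]])

R = np.zeros((2, 3))
R[1, 0] = -0.1                           # effort cost of making the offer
R[1, 1] = -0.1 + p_success * 1.0         # effort cost plus expected payoff of the swap

V, policy = value_iteration(P, R)
print("state values:", np.round(V, 3))
print("greedy policy (0 = wait, 1 = advance):", policy)
```

In this toy setting the agent learns to pay the small effort cost twice in a row because the discounted payoff of the completed swap outweighs it. Behaviours of this kind are built from individual actions rather than assumed as abstract strategies.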

While the target problems below seem different, each involves a sequence of interlocking steps, carried out by pairs of self-interested agents, evolving under selection pressure.

Research aims

To bring reinforcement learning into evolutionary game theory, and use it to make novel contributions to three cross-disciplinary questions:

  • The evolution of money: is money the only way to motivate cooperation between strangers, or are there better options? Is debt just negative reputation? This builds on our previous work, a new game-theoretic view of money as an enabler of help between strangers.
  • Power and space: deconstruct pairwise coercion in terms of sequential game play, and address the construction of living space, i.e. fundamental aspects of environments that favour (vs limit) coercion. In other words, what changes about swapping, if one of the things being exchanged is space?
  • Use Bellman’s technique to elaborate a completely novel view on the evolution of sexual reproduction, addressing one of the most persistent open questions in biology: the so-called “two-fold cost” of males. Males do not produce offspring but do contribute 50% of the genetic material, resulting in a female requiring four offspring to have the same amount of her alleles in the next population as a single offspring using asexual reproduction.


  • Associate Professor Marcus Frean (Project Lead)
  • Professor Stephen Marsland
  • Dr Chrissie Painting