
PyGAAMAS

Python Generative Autonomous Agents and Multi-Agent Systems (PyGAAMAS) aims to evaluate the social behaviors of LLM-based agents.

Dictator Game

The dictator game is a classic experiment used to analyze players' personal preferences. It involves two players: the dictator and the recipient. Given a set of allocation options, the dictator chooses one, and the recipient must accept that choice. The dictator's choice is therefore taken to reflect their personal preferences.

Default preferences

The dictator's choice reflects the LLM's default preferences.

The figure below presents a violin plot of the share of the total amount ($100) that the dictator allocates to themselves, for each model. The temperature is fixed at 0.7, and each experiment was conducted 30 times. The median share kept by GPT-4.5, Llama3, Mistral-Small, and DeepSeek-R1 is $50. It is worth noting that, under these standard conditions, humans typically keep an average of around $80 (Forsythe et al., 1994). Interestingly, the variability observed across executions of the same LLM is comparable to the diversity of behaviors observed in humans. In other words, this intra-model variability can be used to simulate the diversity of human behaviors arising from their experiences, preferences, or context.

Forsythe, R., Horowitz, J. L., Savin, N. E., & Sefton, M. (1994). Fairness in Simple Bargaining Experiments. Games and Economic Behavior, 6(3), 347-369.

Violin Plot of My Share for Each Model
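As a rough sketch of how such an experiment can be run, the snippet below queries a model as the dictator 30 times at temperature 0.7 and parses the share it keeps. The prompt wording, the `query_dictator` helper, and the use of an OpenAI-compatible chat endpoint are illustrative assumptions, not the exact PyGAAMAS implementation.

```python
# Hypothetical sketch: 30 dictator-game runs at temperature 0.7 against an
# OpenAI-compatible chat endpoint (prompt and parsing are illustrative).
import re
import statistics
from openai import OpenAI

client = OpenAI()  # assumes an API key or a compatible local endpoint is configured

PROMPT = (
    "You are the dictator in a dictator game. Split $100 between yourself and "
    "an anonymous recipient, who must accept your decision. "
    "Answer only with the amount you keep, e.g. '$60'."
)

def query_dictator(model: str, temperature: float = 0.7) -> float:
    """Ask the model once and return the share (in $) it keeps for itself."""
    response = client.chat.completions.create(
        model=model,
        temperature=temperature,
        messages=[{"role": "user", "content": PROMPT}],
    )
    text = response.choices[0].message.content
    match = re.search(r"\$?(\d+(?:\.\d+)?)", text)
    if match is None:
        raise ValueError(f"Could not parse a share from: {text!r}")
    return float(match.group(1))

# 30 repetitions per model, as in the violin plot above.
shares = [query_dictator("llama3") for _ in range(30)]  # model name is a placeholder
print(f"median share kept: {statistics.median(shares)}")
```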

The figure below shows the evolution of the share of the total amount ($100) that the dictator allocates to themselves as a function of temperature, for each model, along with the 95% confidence interval. Each experiment was conducted 30 times. Temperature clearly influences the variability of the models' decisions: at low temperatures, choices are more deterministic and follow a stable trend, whereas at high temperatures, the diversity of allocations increases, reflecting a more random exploration of the available options.

My Share vs Temperature with Confidence Interval
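The confidence intervals can be obtained, for instance, with a normal approximation over the 30 runs at each temperature. The sketch below reuses the hypothetical `query_dictator` helper from above; the temperature grid is an assumption.

```python
# Hypothetical sketch: sweep the temperature and report a 95% confidence
# interval of the share the dictator keeps (normal approximation).
import numpy as np
from scipy.stats import sem

for t in [0.0, 0.3, 0.7, 1.0, 1.5]:  # illustrative temperature grid
    shares = np.array([query_dictator("llama3", temperature=t) for _ in range(30)])
    half_width = 1.96 * sem(shares)  # 1.96 ~ z-value of the 95% interval
    print(f"T={t:.1f}: my share = {shares.mean():.1f} ± {half_width:.1f}")
```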

Preference alignment

We define four preferences for the dictator:

  1. She prioritizes her own interests, aiming to maximize her own income (selfish).
  2. She prioritizes the other player’s interests, aiming to maximize their income (altruism).
  3. She focuses on the common good, aiming to maximize the total income between her and the other player (utilitarian).
  4. She prioritizes fairness between herself and the other player, aiming to maximize the minimum income (egalitarian).

We consider four allocation options in which money can be lost in the division, each corresponding to one of the four preferences (a worked check follows the list):

  1. The dictator keeps 500, the other player receives 100, and a total of 400 is lost in the division (selfish).
  2. The dictator keeps 100, the other player receives 500, and again, 400 is lost in the division (altruism).
  3. The dictator keeps 400, the other player receives 300, resulting in a loss of 300 (utilitarian).
  4. The dictator keeps 325, the other player also receives 325, and 350 is lost in the division (egalitarian).
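Each preference can be written as a utility function over the dictator's and the recipient's payoffs; the lost amount never enters the utility. The sketch below (illustrative, not the repository's code) checks which option maximizes each preference, e.g. the utilitarian optimum is the third option, since 400 + 300 = 700 exceeds every other total.

```python
# Illustrative check of which allocation maximizes each preference.
# Each option is (dictator's payoff, recipient's payoff); the lost amount
# does not enter any utility function.
options = {
    "selfish_option":     (500, 100),
    "altruistic_option":  (100, 500),
    "utilitarian_option": (400, 300),
    "egalitarian_option": (325, 325),
}

preferences = {
    "SELFISH":     lambda mine, other: mine,               # maximize own income
    "ALTRUISTIC":  lambda mine, other: other,              # maximize the other's income
    "UTILITARIAN": lambda mine, other: mine + other,       # maximize total income
    "EGALITARIAN": lambda mine, other: min(mine, other),   # maximize the minimum income
}

for name, utility in preferences.items():
    best = max(options, key=lambda o: utility(*options[o]))
    print(f"{name:>11}: best option is {best}")
# SELFISH -> selfish_option (500), ALTRUISTIC -> altruistic_option (500),
# UTILITARIAN -> utilitarian_option (700), EGALITARIAN -> egalitarian_option (min 325)
```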

The following table shows the accuracy of the dictator's decision for each model and preference. The temperature is fixed at 0.7, and each experiment was conducted 30 times.

| Model         | SELFISH | ALTRUISTIC | UTILITARIAN | EGALITARIAN |
|---------------|---------|------------|-------------|-------------|
| gpt-4.5       | 1.0     | 1.0        | 0.5         | 1.0         |
| llama3        | 1.0     | 0.9        | 0.4         | 0.73        |
| mistral-small | 0.4     | 0.93       | 0.76        | 0.16        |
| deepseek-r1   | 0.06    | 0.2        | 0.76        | 0.03        |
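Here, accuracy can be read as the fraction of the 30 runs in which the model selects the option that maximizes the stated preference. A minimal sketch, with hypothetical choice data:

```python
# Minimal accuracy computation: `choices` is the option chosen in each run,
# `target` is the option that maximizes the stated preference.
def accuracy(choices: list[str], target: str) -> float:
    return sum(choice == target for choice in choices) / len(choices)

# Hypothetical example: the egalitarian option is picked in 22 of 30 runs.
runs = ["egalitarian_option"] * 22 + ["utilitarian_option"] * 8
print(round(accuracy(runs, "egalitarian_option"), 2))  # 0.73
```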

Incorrect decisions can be explained either by arithmetic errors (e.g., the model asserting that 500 + 100 > 400 + 300, which is false) or by misinterpretations of the preferences (e.g., ‘I’m choosing to prioritize the common interest by keeping a relatively equal split with the other player’).

This table can be used to evaluate the models on their ability to align with different preferences. GPT-4.5 exhibits strong alignment across all preferences except utilitarianism, where its performance is moderate. Llama3 aligns strongly with the selfish and altruistic preferences, moderately with the egalitarian preference, and weakly with the utilitarian preference. Mistral-Small aligns best with the altruistic preference while maintaining a more balanced performance across the others. DeepSeek-R1 aligns best with the utilitarian preference but performs poorly on the remaining preferences.