Commit 00cd8587 authored by stephanebonnevay

Readme

parent f8b59471
@@ -407,10 +407,40 @@ inconsistent and often irrational decision-making, failing to generate valid str
Qwen3 struggles to generate valid strategies, reflecting limited high-level planning. However, it shows strong
first-order rationality when producing actions, especially under explicit or guided conditions,
and benefits from conditional reasoning. Its performance declines with implicit beliefs, highlighting limitations
in deeper inference.
## Beliefs - MP
Beliefs, whether implicit, explicit, or given, are crucial for an autonomous agent's decision-making process. They allow the agent to anticipate the actions of other agents.
### Refine beliefs
To assess the agents' ability to refine their beliefs when predicting their interlocutor's next action, we consider the matching pennies game, played between two players: an agent and an opponent. Each player has a penny and secretly turns it to Head or Tail. The players then reveal their choices simultaneously. If the pennies match (both Head or both Tail), the agent wins 1 point. Otherwise, the opponent wins and the agent loses 1 point. Each player's objective is to maximize their total gain.
In this game:
- the opponent follows a hidden strategy, i.e., a repetition model;
- the agent must predict the opponent's next move (Head or Tail);
- a correct prediction earns 1 point, while an incorrect one earns 0 points;
- the game can be played for $N=10$ rounds, and the agent's accuracy is evaluated at each round.
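The two scoring rules described above (the game payoff and the prediction score) can be sketched as follows. This is a minimal illustration, not the repository's code: the function names `payoff`, `prediction_score`, and `best_response` are hypothetical, and moves are assumed to be encoded as the strings `"Head"` and `"Tail"`.

```python
HEAD, TAIL = "Head", "Tail"  # assumed string encoding of the two moves


def payoff(agent_move: str, opponent_move: str) -> int:
    """Agent's payoff for one round: +1 if the pennies match, -1 otherwise."""
    return 1 if agent_move == opponent_move else -1


def prediction_score(predicted_move: str, actual_move: str) -> int:
    """Prediction score for one round: 1 if the prediction is correct, else 0."""
    return 1 if predicted_move == actual_move else 0


def best_response(predicted_opponent_move: str) -> str:
    """The agent wins on a match, so it should play the move it predicts."""
    return predicted_opponent_move


# An agent that predicts the opponent's move correctly and best-responds wins the round.
opponent_move = TAIL
prediction = TAIL
round_payoff = payoff(best_response(prediction), opponent_move)
```

Because the agent wins on a match, a correct prediction translates directly into a winning action, which is why prediction accuracy and points earned can be reported side by side.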
For our experiments, we consider two simple models for the opponent where:
- the action remains constant (always Head or always Tail);
- the actions follow an alternating pattern (Head-Tail or Tail-Head).
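The two opponent models can be sketched as infinite move generators. This is an illustrative sketch only: the names `constant_opponent` and `alternating_opponent` are hypothetical, and the string encoding of moves is an assumption.

```python
from itertools import cycle, islice


def constant_opponent(move: str):
    """Opponent that always plays the same move (always Head or always Tail)."""
    return cycle([move])


def alternating_opponent(first_move: str):
    """Opponent that alternates between the two moves, starting with first_move."""
    second_move = "Tail" if first_move == "Head" else "Head"
    return cycle([first_move, second_move])


N = 10  # number of rounds, as in the experiments described above
constant_moves = list(islice(constant_opponent("Head"), N))
alternating_moves = list(islice(alternating_opponent("Tail"), N))
```

Both patterns are fully determined after the first one or two observed rounds, so an agent that refines its beliefs from the history should reach near-perfect prediction early in the game.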
We evaluate the models' ability to identify these behavioural patterns by calculating the average number of points earned per round.
The figures below present the average points earned per round, with the 95% confidence interval, for each LLM against the two opponent behaviour
models in the matching pennies game, whether the LLM generates a strategy or one-shot actions.
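The source does not state how the confidence intervals are computed. As one possible approach, the sketch below uses a normal approximation (mean ± 1.96 standard errors) over repeated runs. The function names and the layout of `runs` (one list of per-round scores per run) are assumptions made for illustration.

```python
import math
import statistics


def mean_with_ci95(scores):
    """Return (mean, lower, upper) using a normal-approximation 95% interval."""
    n = len(scores)
    mean = statistics.fmean(scores)
    if n < 2:
        # A single observation gives no spread estimate; return a degenerate interval.
        return mean, mean, mean
    standard_error = statistics.stdev(scores) / math.sqrt(n)
    half_width = 1.96 * standard_error
    return mean, mean - half_width, mean + half_width


def per_round_summary(runs):
    """Aggregate across runs: one (mean, lower, upper) triple per round."""
    return [mean_with_ci95(list(round_scores)) for round_scores in zip(*runs)]


# Hypothetical toy data: 4 runs of 2 rounds each, with 0/1 prediction scores.
toy_runs = [[1, 0], [1, 1], [1, 0], [1, 1]]
summary = per_round_summary(toy_runs)
```

With binary scores and few runs, the normal approximation can yield bounds outside [0, 1]; a Wilson or bootstrap interval would be an alternative in that case.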
...
![Prediction Accuracy per Round by Actions Against Constant Behaviour (with 95% Confidence Interval)](figures/mp/mp_prediction_ConstHT.svg)
![Points Earned per Round by Actions Against Constant Behaviour (with 95% Confidence Interval)](figures/mp/mp_payoff_ConstHT.svg)
![Prediction Accuracy per Round by Actions Against Alternate Behaviour (with 95% Confidence Interval)](figures/mp/mp_prediction_Altern.svg)
![Points Earned per Round by Actions Against Alternate Behaviour (with 95% Confidence Interval)](figures/mp/mp_payoff_Altern.svg)
## Beliefs - RPS
Beliefs, whether implicit, explicit, or
given, are crucial for an autonomous agent's decision-making process. They
...