From 00cd85877fa7fc983cc2e2e1c7e15f7aa9b22efb Mon Sep 17 00:00:00 2001
From: stephanebonnevay <stephane.bonnevay@lizeo-group.com>
Date: Thu, 5 Jun 2025 07:19:54 +0200
Subject: [PATCH] Readme

---
 README.md | 34 ++++++++++++++++++++++++++++++++--
 1 file changed, 32 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 619649a..ba1b2ec 100644
--- a/README.md
+++ b/README.md
@@ -407,10 +407,40 @@ inconsistent and often irrational decision-making, failing to generate valid str
 Qwen3 struggles to generate valid strategies, reflecting limited high-level planning. However, it shows strong 
 first-order rationality when producing actions, especially under explicit or guided conditions, 
 and benefits from conditional reasoning. Its performance declines with implicit beliefs, highlighting limitations 
-in deeper inference. 
+in deeper inference.
 
+## Beliefs - MP
 
-## Beliefs
+Beliefs, whether implicit, explicit, or given, are crucial for an autonomous agent's decision-making process. They allow the agent to anticipate the actions of other agents.
+
+### Refine beliefs
+
+To assess the agents' ability to refine their beliefs when predicting their interlocutor's next action, we consider the matching pennies game, which is played between two players: an agent and an opponent. Each player has a penny and secretly turns it to Head or Tail. The players then reveal their choices simultaneously. If the pennies match (both Head or both Tail), the agent wins 1 point. Otherwise, the opponent wins and the agent loses 1 point. The agent's objective is to maximize its total gain.
+
+In this game:
+- the opponent follows a hidden strategy, i.e., a repetition model;
+- the agent must predict the opponent's next move (Head or Tail);
+- a correct prediction earns 1 point, while an incorrect one earns 0 points;
+- the game can be played for $N=10$ rounds, and the agent's accuracy is evaluated at each round.
+
+For our experiments, we consider two simple models for the opponent, in which:
+- the actions remain constant (always Head or always Tail);
+- the actions alternate (Head-Tail or Tail-Head).
+
+We evaluate the models' ability to identify these behavioural patterns by calculating the average number of points earned per round.
+
+The figures below present the prediction accuracy and the average points earned per round, with 95% confidence intervals, for each LLM against the two opponent behaviour
+models in the matching pennies game, whether the LLM generates a strategy or one-shot actions.
+
+...
+
+![Prediction Accuracy per Round by Actions Against Constant Behaviour (with 95% Confidence Interval)](figures/mp/mp_prediction_ConstHT.svg)
+![Points Earned per Round by Actions Against Constant Behaviour (with 95% Confidence Interval)](figures/mp/mp_payoff_ConstHT.svg)
+![Prediction Accuracy per Round by Actions Against Alternate Behaviour (with 95% Confidence Interval)](figures/mp/mp_prediction_Altern.svg)
+![Points Earned per Round by Actions Against Alternate Behaviour (with 95% Confidence Interval)](figures/mp/mp_payoff_Altern.svg)
+
+
+## Beliefs - RPS
 
 Beliefs — whether implicit, explicit, or
 given — are crucial for an autonomous agent's decision-making process. They
-- 
GitLab