Readme

Commit 00cd8587 authored 1 month ago by stephanebonnevay
--- a/README.md
+++ b/README.md
@@ -407,10 +407,40 @@ inconsistent and often irrational decision-making, failing to generate valid str
 Qwen3 struggles to generate valid strategies, reflecting limited high-level planning. However, it shows strong 
 first-order rationality when producing actions, especially under explicit or guided conditions, 
 and benefits from conditional reasoning. Its performance declines with implicit beliefs, highlighting limitations 
-in deeper inference. 
+in deeper inference.
+## Beliefs - MP
-## Beliefs
+Beliefs — whether implicit, explicit, or given — are crucial for an autonomous agent's decision-making process. They allow for anticipating the actions of other agents.
+### Refine beliefs
+To assess the agents' ability to refine their beliefs when predicting their interlocutor's next action, we consider the matching pennies game, which is played between two players: an agent and an opponent. Each player has a penny and must secretly turn it to Head or Tail. The players then reveal their choices simultaneously. If the pennies match (both Head or both Tail), the agent wins 1 point. If not, the opponent wins and the agent loses 1 point. Each player's objective is to maximize their total gain.
+In this game:
+- the opponent follows a hidden strategy, i.e., a repetition model;
+- the agent must predict the opponent's next move (Head or Tail);
+- a correct prediction earns 1 point, while an incorrect one earns 0 points;
+- the game can be played for $N=10$ rounds, and the agent's accuracy is evaluated at each round.
+For our experiments, we consider two simple opponent models:
+- constant: the opponent always plays the same action (always Head or always Tail);
+- alternating: the opponent alternates between the two actions (Head-Tail or Tail-Head).
+We evaluate the models' ability to identify these behavioural patterns by calculating the average number of points earned per round.
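The prediction setup described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the repository's evaluation code: the function names (`constant_opponent`, `alternating_opponent`, `naive_predictor`, `play`, `average_per_round`) are hypothetical, and `naive_predictor` is a hand-written stand-in for the LLM agent.

```python
from itertools import cycle, islice

def constant_opponent(move, n_rounds=10):
    """Opponent that always plays the same move ("Head" or "Tail")."""
    return [move] * n_rounds

def alternating_opponent(first, n_rounds=10):
    """Opponent that alternates, starting with `first` (Head-Tail or Tail-Head)."""
    second = "Tail" if first == "Head" else "Head"
    return list(islice(cycle([first, second]), n_rounds))

def naive_predictor(history):
    """Hypothetical stand-in for the agent: guess from the last two observed moves."""
    if not history:
        return "Head"  # arbitrary first guess
    if len(history) >= 2 and history[-1] != history[-2]:
        return history[-2]  # looks alternating: expect the move before last
    return history[-1]  # otherwise expect a repetition

def play(opponent_moves, predictor):
    """Return the points per round: 1 for a correct prediction, 0 otherwise."""
    history, points = [], []
    for move in opponent_moves:
        guess = predictor(history)
        points.append(1 if guess == move else 0)
        history.append(move)  # the move is revealed after the prediction
    return points

def average_per_round(games):
    """Average the points earned at each round over several games."""
    return [sum(column) / len(column) for column in zip(*games)]

if __name__ == "__main__":
    games = [
        play(constant_opponent("Head"), naive_predictor),
        play(constant_opponent("Tail"), naive_predictor),
    ]
    print(average_per_round(games))
```

Replacing `naive_predictor` with a function that queries a model would give the per-round accuracy for that model, under the same scoring rule of 1 point per correct prediction.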
+The figures below present the average points earned per round and the 95% confidence interval for each LLM against the two opponent behaviour
+models in the matching pennies game, whether the LLM generates a strategy or one-shot actions.
+...
+![Prediction Accuracy per Round by Actions Against Constant Behaviour (with 95% Confidence Interval)](figures/mp/mp_prediction_ConstHT.svg)
+![Points Earned per Round by Actions Against Constant Behaviour (with 95% Confidence Interval)](figures/mp/mp_payoff_ConstHT.svg)
+![Prediction Accuracy per Round by Actions Against Alternate Behaviour (with 95% Confidence Interval)](figures/mp/mp_prediction_Altern.svg)
+![Points Earned per Round by Actions Against Alternate Behaviour (with 95% Confidence Interval)](figures/mp/mp_payoff_Altern.svg)
+## Beliefs - RPS
 Beliefs — whether implicit, explicit, or
 given — are crucial for an autonomous agent's decision-making process. They