Commit 5127ede3 authored by stephanebonnevay

Readme

parent 88756456
@@ -429,10 +429,9 @@ For our experiments, we consider two simple models for the opponent where:
 We evaluate the models' ability to identify these behavioural patterns by calculating the average number of points earned per round.
-Figures present the average points earned per round and the 95% confidence interval for each LLM against the two opponent behavior
-models in the matching pennies game, whether the LLM generates a strategy or one-shot actions.
+Figures present the average points earned and prediction per round (95% confidence interval) for each LLM against the two opponent behavior (constant and alternate) models in the matching pennies game.
... Against Constant behavior, <tt>GPT-4.5</tt> and <tt>Qwen3</tt> ...
 ![Prediction Accuracy per Round by Actions Against Constant Behaviour (with 95% Confidence Interval)](figures/mp/mp_prediction_ConstHT.svg)
 ![Points Earned per Round by Actions Against Constant Behaviour (with 95% Confidence Interval)](figures/mp/mp_payoff_ConstHT.svg)
......
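
The per-round averages and 95% confidence intervals mentioned in the changed README text could be computed roughly as in the sketch below. This is only an illustration, not the repository's code: the function names, the 1-point-per-correct-prediction payoff, the `"H"`/`"T"` move encoding, and the normal-approximation interval (z = 1.96) are all assumptions made for the example.

```python
import math
from statistics import mean, stdev


def mean_ci95(samples):
    """Return (mean, half_width) of a normal-approximation 95% CI.

    Assumption: z = 1.96 with the sample standard deviation; the
    README does not say how its intervals were computed.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    half_width = 1.96 * stdev(samples) / math.sqrt(n)
    return mean(samples), half_width


def constant_opponent(round_index):
    """Hypothetical constant behaviour: always plays heads."""
    return "H"


def alternate_opponent(round_index):
    """Hypothetical alternate behaviour: H, T, H, T, ..."""
    return "H" if round_index % 2 == 0 else "T"


def points_per_round(predictions, opponent):
    """Hypothetical payoff: 1 point when the prediction matches the opponent's move."""
    return [1 if p == opponent(i) else 0 for i, p in enumerate(predictions)]


def per_round_stats(runs, opponent):
    """For each round, aggregate points over independent runs into (mean, half_width)."""
    scored = [points_per_round(run, opponent) for run in runs]
    return [mean_ci95(list(round_points)) for round_points in zip(*scored)]
```

For example, `per_round_stats(runs, constant_opponent)` takes a list of runs, each a list of predicted moves, and returns one `(mean, half_width)` pair per round, which is the shape a plot with error bars would need.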