@@ -429,10 +429,9 @@ For our experiments, we consider two simple models for the opponent where:
...
We evaluate the models' ability to identify these behavioural patterns by calculating the average number of points earned per round.
Figures present the average points earned and the prediction per round, each with a 95% confidence interval, for each LLM against the two opponent behavior models (constant and alternate) in the matching pennies game, whether the LLM generates a strategy or one-shot actions.
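As a minimal sketch of how an average with a 95% confidence interval could be computed from per-round points, the snippet below uses a normal approximation (mean ± 1.96 standard errors). The function name `mean_with_ci95`, the +1/-1 payoff encoding, and the example data are assumptions made for illustration; the text does not specify how the intervals were actually computed.

```python
import math
from statistics import mean, stdev


def mean_with_ci95(points):
    """Return (mean, half_width) for a normal-approximation 95% CI.

    Assumption: rounds are treated as independent samples, and the
    interval is mean +/- 1.96 * standard error.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two rounds to estimate a CI")
    avg = mean(points)
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    half_width = 1.96 * stdev(points) / math.sqrt(n)
    return avg, half_width


# Hypothetical per-round payoffs (+1 for a win, -1 for a loss);
# these are NOT results from the experiments.
example_points = [1, 1, -1, 1, -1, 1, 1, 1]
avg, half_width = mean_with_ci95(example_points)
print(f"average points per round: {avg:.3f} +/- {half_width:.3f}")
```

With few rounds, a Student's t critical value or a bootstrap interval would be a more conservative choice than the fixed 1.96 multiplier.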
...
Against Constant behavior, <tt>GPT-4.5</tt> and <tt>Qwen3</tt>...



