@@ -429,10 +429,9 @@ For our experiments, we consider two simple models for the opponent where:
We evaluate the models' ability to identify these behavioral patterns by calculating the average number of points earned per round.
Figures present the average points earned per round and the 95% confidence interval for each LLM against the two opponent behavior
models in the matching pennies game, whether the LLM generates a strategy or one-shot actions.
Figures present the average points earned and the predictions per round, each with a 95% confidence interval, for each LLM against the two opponent behavior models (constant and alternate) in the matching pennies game.
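
As an illustration of the metric described above, the following is a minimal sketch of how the average points per round and a 95% confidence interval could be computed against the two opponent models. It is not the evaluation code used for the figures. The function names, the payoff scheme (+1 when the player's action matches the opponent's, -1 otherwise), and the normal-approximation interval (1.96 standard errors) are all assumptions made for the example.

```python
import math
from statistics import mean, stdev


def constant_opponent(n_rounds, action="H"):
    """Hypothetical constant model: the opponent plays the same action every round."""
    return [action] * n_rounds


def alternating_opponent(n_rounds, first="H"):
    """Hypothetical alternate model: the opponent switches action every round."""
    other = "T" if first == "H" else "H"
    return [first if i % 2 == 0 else other for i in range(n_rounds)]


def round_points(player_actions, opponent_actions):
    """Assumed payoff: +1 if the player matches the opponent, -1 otherwise."""
    return [1 if p == o else -1 for p, o in zip(player_actions, opponent_actions)]


def mean_with_ci95(values):
    """Return (mean, half-width) of a normal-approximation 95% confidence interval."""
    n = len(values)
    m = mean(values)
    if n < 2:
        # A single observation gives no spread estimate.
        return m, 0.0
    half_width = 1.96 * stdev(values) / math.sqrt(n)
    return m, half_width


# Usage: a made-up sequence of player actions against the alternating opponent.
opponent = alternating_opponent(6)
player = ["H", "T", "H", "H", "H", "T"]
points = round_points(player, opponent)
avg, half_width = mean_with_ci95(points)
print(f"average points per round: {avg:.3f} +/- {half_width:.3f}")
```

With few rounds, a Student's t interval would be wider than this normal approximation; the choice here is only for brevity.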
...
Against Constant behavior, <tt>GPT-4.5</tt> and <tt>Qwen3</tt>...

