From 5fd418d2844f290165649bd480d631e9d6f47dcb Mon Sep 17 00:00:00 2001
From: stephanebonnevay <stephane.bonnevay@lizeo-group.com>
Date: Fri, 6 Jun 2025 07:45:45 +0200
Subject: [PATCH] Readme

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index c35c049..3f3a912 100644
--- a/README.md
+++ b/README.md
@@ -432,7 +432,7 @@ We evaluate the models' ability to identify these behavioural patterns by calcul
 Figures present the average points earned and prediction per round (95% confidence interval) for each LLM against the two opponent behavior models (constant and alternate) in the matching pennies game.
 
 Against Constant behavior, <tt>GPT-4.5</tt> and <tt>Qwen3</tt> were able to generate a valid strategy. The charts show that they are able to correctly predict their opponent's strategy after just a few rounds. They perfectly identify the fact that their opponent always plays the same move.
-The predictions made by <tt>Mistral-Small<tt>, <tt>LLaMA3</tt>, and <tt>DeepSeek-R1</tt> are not incorrect, but the moves played are not in line with these predictions, which leads to a fairly low expected gain.
+The predictions made by <tt>Mistral-Small</tt>, <tt>LLaMA3</tt>, and <tt>DeepSeek-R1</tt> are not incorrect, but the moves played are not in line with these predictions, which leads to a fairly low expected gain.

-- 
GitLab