@@ -434,15 +434,14 @@ Figures present the average points earned and prediction per round (95% confiden
...
Against Constant behavior, <tt>GPT-4.5</tt> and <tt>Qwen3</tt> generated a valid strategy. The charts show that both correctly predict their opponent's move after just a few rounds: they identify that the opponent always plays the same move.
The predictions made by <tt>Mistral-Small</tt>, <tt>LLaMA3</tt>, and <tt>DeepSeek-R1</tt> are not incorrect, but the moves they play are inconsistent with these predictions, which leads to a fairly low expected gain.
The models exhibit varied approaches to decision-making in the MP game.
<tt>GPT-4.5</tt> follows a fixed alternating pattern, switching between "Head" and "Tail" each turn and assuming the opponent behaves similarly.
<tt>Mistral-Small</tt> adopts a reactive strategy, analyzing the frequency of the opponent’s past moves and selecting the less common one. <tt>Qwen3</tt>, by contrast, relies on randomness, choosing moves unpredictably while presuming the opponent will mimic its choice. <tt>LLaMA3</tt> does not implement a functioning strategy. Overall, these approaches are simplistic heuristics that may lack credibility and efficiency.
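
To make the three described heuristics concrete, here is a minimal Python sketch. It is an illustration only, not the code the models actually generated: the function names, the choice of "Head" as the opening move, and the tie-breaking rule are all assumptions.

```python
import random
from collections import Counter

MOVES = ("Head", "Tail")


def other(move):
    """Return the opposite move."""
    return "Tail" if move == "Head" else "Head"


def alternating(own_history):
    """Fixed alternation: open with "Head" (assumed), then flip the previous own move."""
    return "Head" if not own_history else other(own_history[-1])


def less_frequent(opponent_history):
    """Pick the move the opponent has played less often so far.

    Ties, including an empty history, resolve to "Head" (assumed).
    """
    counts = Counter(opponent_history)
    return min(MOVES, key=lambda m: counts[m])


def uniform_random(rng):
    """Pick a move uniformly at random using the supplied random.Random instance."""
    return rng.choice(MOVES)
```

Against a Constant opponent that always plays "Head", `less_frequent` settles on "Tail" after the first round, while `alternating` and `uniform_random` ignore the opponent's history entirely.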