Commit b44a21e8 authored by stephanebonnevay

Readme

parent 7b8f9f33
@@ -434,15 +434,14 @@ Figures present the average points earned and prediction per round (95% confiden
Against Constant behavior, <tt>GPT-4.5</tt> and <tt>Qwen3</tt> were able to generate a valid strategy. The charts show that they are able to correctly predict their opponent's strategy after just a few rounds. They perfectly identify the fact that their opponent always plays the same move.
The predictions made by <tt>Mistral-Small</tt>, <tt>LLaMA3</tt>, and <tt>DeepSeek-R1</tt> are not incorrect, but the moves they play do not follow from these predictions, which leads to a fairly low expected gain.
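The gap between a correct prediction and a move that ignores it can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the experiment code: the player is treated as the matcher, it earns 1 point when both moves match and 0 otherwise, and the names `best_response` and `average_payoff` are hypothetical.

```python
import random

MOVES = ("Head", "Tail")


def best_response(predicted_opponent_move: str) -> str:
    """Assumed matcher role: the best reply is to copy the predicted move."""
    return predicted_opponent_move


def average_payoff(consistent: bool, rounds: int = 1000, seed: int = 0) -> float:
    """Average points per round against a constant opponent.

    The prediction is always correct. If `consistent` is True, the move is
    the best response to that prediction; otherwise the move ignores the
    prediction and is drawn uniformly at random.
    """
    rng = random.Random(seed)
    opponent_move = "Head"  # constant behavior: always the same move
    total = 0
    for _ in range(rounds):
        prediction = opponent_move  # correct prediction of the opponent
        if consistent:
            move = best_response(prediction)
        else:
            move = rng.choice(MOVES)
        # Assumed payoff: 1 point on a match, 0 otherwise.
        total += 1 if move == opponent_move else 0
    return total / rounds
```

Under these assumptions, a player that acts on its correct prediction earns the maximum every round, while a player whose move is unrelated to the prediction earns about half of it on average.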
![Prediction Accuracy per Round by Actions Against Constant Behaviour (with 95% Confidence Interval)](figures/mp/mp_prediction_ConstHT.svg)
![Points Earned per Round by Actions Against Constant Behaviour (with 95% Confidence Interval)](figures/mp/mp_payoff_ConstHT.svg)
![Prediction Accuracy per Round by Actions Against Alternate Behaviour (with 95% Confidence Interval)](figures/mp/mp_prediction_Altern.svg)
![Points Earned per Round by Actions Against Alternate Behaviour (with 95% Confidence Interval)](figures/mp/mp_payoff_Altern.svg)
The models take different approaches to decision-making in the MP game.
<tt>GPT-4.5</tt> follows a fixed alternating pattern, switching between "Head" and "Tail" each turn, on the assumption that the opponent behaves similarly.
<tt>Mistral-Small</tt> adopts a reactive strategy: it analyzes the frequency of the opponent’s past moves and selects the less common one. <tt>Qwen3</tt>, by contrast, relies on randomness, choosing moves unpredictably while presuming the opponent will mimic its choice. <tt>LLaMA3</tt> does not implement a functioning strategy. Overall, these approaches are simplistic heuristics that may lack credibility and efficiency.
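The three heuristics described above can be sketched as follows. This is an illustrative reconstruction from the prose only, not the code the models actually generated: the function names, the shared `(round_index, opponent_history)` signature, the tie-breaking rule, and the `play` helper are all assumptions.

```python
import random
from collections import Counter

MOVES = ("Head", "Tail")


def alternating(round_index, opponent_history):
    """Fixed alternating pattern: Head, Tail, Head, Tail, ..."""
    return MOVES[round_index % 2]


def less_frequent(round_index, opponent_history):
    """Reactive heuristic: play the opponent's less common past move.

    Assumption: with no history, or on a tie, the first move is returned.
    """
    if not opponent_history:
        return MOVES[0]
    counts = Counter(opponent_history)
    return min(MOVES, key=lambda move: counts[move])


def uniform_random(round_index, opponent_history, rng=random):
    """Random heuristic: ignore the history and draw a move uniformly."""
    return rng.choice(MOVES)


def play(strategy, opponent_moves):
    """Run a strategy against a fixed opponent sequence (hypothetical helper)."""
    history = []
    moves = []
    for round_index, opponent_move in enumerate(opponent_moves):
        moves.append(strategy(round_index, history))
        history.append(opponent_move)  # revealed only after the round
    return moves
```

For example, `play(less_frequent, ["Head"] * 4)` shows how the frequency heuristic settles on the move a constant opponent never plays, whereas `alternating` never adapts to the opponent at all.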
## Beliefs - RPS