Commit cfc39344 authored by Maxime Morge

PyGAAMAS: evaluate Qwen3 in the investment game

parent b0169be4
@@ -52,7 +52,7 @@ Figure below highlight significant differences in decision-making
 consistency among the evaluated models. <tt>GPT-4.5</tt>, <tt>LLama3.3:latest</tt>
 and <tt>DeepSeek-R1:7b</tt> stand out with a
 perfect CCEI score of 1.0, indicating flawless rationality in decision-making.
-<tt>Mistral-Small</tt> and <tt>Mixtral:8x7b</tt> demonstrate the next highest level of rationality.
+<tt>Qwen3</tt>, <tt>Mistral-Small</tt> and <tt>Mixtral:8x7b</tt> demonstrate the next highest level of rationality.
 <tt>Llama3</tt> performs moderately well, with CCEI values ranging between 0.2 and 0.74.
 <tt>DeepSeek-R1</tt> exhibits
 inconsistent behavior, with CCEI scores varying widely between 0.15 and 0.83.
......
@@ -234,4 +234,34 @@ iteration,model,temperature,ccei
 28,gpt-4.5-preview-2025-02-27,0.0,1.0
 29,gpt-4.5-preview-2025-02-27,0.0,1.0
 30,gpt-4.5-preview-2025-02-27,0.0,1.0
+1,qwen3,0.0,1.0
+2,qwen3,0.0,1.0
+3,qwen3,0.0,1.0
+4,qwen3,0.0,1.0
+5,qwen3,0.0,1.0
+6,qwen3,0.0,0.85
+7,qwen3,0.0,1.0
+8,qwen3,0.0,1.0
+9,qwen3,0.0,1.0
+10,qwen3,0.0,1.0
+11,qwen3,0.0,1.0
+12,qwen3,0.0,1.0
+13,qwen3,0.0,1.0
+14,qwen3,0.0,1.0
+15,qwen3,0.0,1.0
+16,qwen3,0.0,1.0
+17,qwen3,0.0,1.0
+18,qwen3,0.0,1.0
+19,qwen3,0.0,1.0
+20,qwen3,0.0,1.0
+21,qwen3,0.0,0.85
+22,qwen3,0.0,1.0
+23,qwen3,0.0,1.0
+24,qwen3,0.0,1.0
+25,qwen3,0.0,1.0
+26,qwen3,0.0,1.0
+27,qwen3,0.0,1.0
+28,qwen3,0.0,0.85
+29,qwen3,0.0,1.0
+30,qwen3,0.0,1.0
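
The per-model CCEI rows above can be condensed into a minimum, mean, and maximum per model. The following is a minimal sketch, assuming only the header shown in the hunk (`iteration,model,temperature,ccei`); the helper name `summarize_ccei` and the inline sample are illustrative and not part of the repository.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# A few rows in the format shown above (illustrative sample, not the full file).
sample = """iteration,model,temperature,ccei
5,qwen3,0.0,1.0
6,qwen3,0.0,0.85
7,qwen3,0.0,1.0
30,gpt-4.5-preview-2025-02-27,0.0,1.0
"""


def summarize_ccei(handle):
    """Return {model: (min, mean, max)} computed over the ccei column."""
    scores = defaultdict(list)
    for row in csv.DictReader(handle):
        scores[row["model"]].append(float(row["ccei"]))
    return {model: (min(v), mean(v), max(v)) for model, v in scores.items()}


summary = summarize_ccei(io.StringIO(sample))
for model, (lowest, average, highest) in summary.items():
    print(f"{model}: min={lowest:.2f} mean={average:.3f} max={highest:.2f}")
```

Applied to the 30 `qwen3` rows added in this commit (27 runs at 1.0 and 3 runs at 0.85), the same computation gives a minimum of 0.85 and a mean of 0.985.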
@@ -4,11 +4,15 @@ import matplotlib.pyplot as plt
 # Custom color palette
 color_palette = {
-'random' : '#333333', # Black
-'gpt-4.5-preview-2025-02-27': '#7abaff', # Blue
-'llama3': '#32a68c', # Green
-'mistral-small': '#ff6941', # Orange
-'deepseek-r1': '#5862ed' # Indigo
+'random': '#333333', # Black
+'gpt-4.5-preview-2025-02-27': '#7abaff', # BlueEscape
+'llama3': '#32a68c', # GreenFuture
+'llama3.3:latest': '#4b9f7d', # GreenLlama3.3
+'mistral-small': '#ff6941', # WarmOrange
+'mixtral:8x7b': '#f1a61a', # YellowMixtral
+'deepseek-r1': '#5862ed', # InclusiveIndigo
+'deepseek-r1:7b': '#9a7bff', # PurpleDeepseek-r1:7b
+'qwen3': '#000000'
 }
 # Load CSV file
......
@@ -11,7 +11,8 @@ color_palette = {
 'mistral-small': '#ff6941', # WarmOrange
 'mixtral:8x7b': '#f1a61a', # YellowMixtral
 'deepseek-r1': '#5862ed', # InclusiveIndigo
-'deepseek-r1:7b': '#9a7bff' # PurpleDeepseek-r1:7b
+'deepseek-r1:7b': '#9a7bff', # PurpleDeepseek-r1:7b
+'qwen3': '#000000'
 }
 # Specify the order of models for the x-axis
@@ -20,7 +21,8 @@ model_order = [
 'gpt-4.5-preview-2025-02-27',
 'llama3', 'llama3.3:latest', # Place llama3 and llama3.3:latest together
 'mistral-small', 'mixtral:8x7b', # Bring mistral-small and mixtral:8x7b closer
-'deepseek-r1', 'deepseek-r1:7b'
+'deepseek-r1', 'deepseek-r1:7b',
+'qwen3'
 ]
......
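
This commit adds `'qwen3'` to both `color_palette` and `model_order`, so the two structures have to stay in sync whenever a model is introduced. A minimal sketch of a consistency check follows; it uses only the entries visible in the hunks above, and `missing_colors` is a hypothetical helper, not a function from the repository.

```python
# Entries copied from the hunks above; the real files may contain more.
color_palette = {
    'random': '#333333',
    'gpt-4.5-preview-2025-02-27': '#7abaff',
    'llama3': '#32a68c',
    'llama3.3:latest': '#4b9f7d',
    'mistral-small': '#ff6941',
    'mixtral:8x7b': '#f1a61a',
    'deepseek-r1': '#5862ed',
    'deepseek-r1:7b': '#9a7bff',
    'qwen3': '#000000',
}

model_order = [
    'gpt-4.5-preview-2025-02-27',
    'llama3', 'llama3.3:latest',
    'mistral-small', 'mixtral:8x7b',
    'deepseek-r1', 'deepseek-r1:7b',
    'qwen3',
]


def missing_colors(order, palette):
    """Return the models listed in `order` that have no color in `palette`."""
    return [model for model in order if model not in palette]


# With the updated palette every plotted model has a color.
complete = missing_colors(model_order, color_palette)

# Simulate the palette before this commit: 'qwen3' would have no color.
old_palette = {k: v for k, v in color_palette.items() if k != 'qwen3'}
incomplete = missing_colors(model_order, old_palette)

print(complete, incomplete)
```

Running such a check before plotting would surface a model that was added to the x-axis order but not to the palette.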
@@ -3,7 +3,7 @@ import csv
 from investment import Investment
 # Define models, temperature, and iterations
-models = ["deepseek-r1:7b"] # "gpt-4.5-preview-2025-02-27", "optimal", "random", "llama3", "mistral-small", "deepseek-r1", "mixtral:8x7b", "llama3.3:latest",
+models = ["qwen3"] # "gpt-4.5-preview-2025-02-27", "optimal", "random", "llama3", "mistral-small", "deepseek-r1", "mixtral:8x7b", "llama3.3:latest",
 temperature = 0.0
 iterations = 30
 output_file = "../../data/investment/investment.csv"
......
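
The rest of this script is elided in the diff. As a rough sketch of how the `models`, `temperature`, and `iterations` settings above could drive a run, the loop below writes one CSV row per model and iteration. Everything beyond those three settings is an assumption: `run_once` is a hypothetical stand-in for whatever the `Investment` class does, and the `result` column is a placeholder rather than the actual schema of `investment.csv`.

```python
import csv
import io

models = ["qwen3"]
temperature = 0.0
iterations = 30


def run_once(model, temperature):
    """Hypothetical stand-in for one game run; returns a placeholder result."""
    return {"result": "placeholder"}


def run_experiments(handle, models, temperature, iterations, run=run_once):
    """Write one row per (model, iteration) to an open text handle."""
    writer = csv.writer(handle)
    # Assumed column names; the real output schema is not shown in the diff.
    writer.writerow(["iteration", "model", "temperature", "result"])
    for model in models:
        for iteration in range(1, iterations + 1):
            outcome = run(model, temperature)
            writer.writerow([iteration, model, temperature, outcome["result"]])


# An in-memory buffer keeps the sketch free of file-system side effects.
buffer = io.StringIO()
run_experiments(buffer, models, temperature, iterations)
rows = buffer.getvalue().splitlines()
print(len(rows))
```

Numbering iterations from 1 matches the `iteration` column in the CCEI data, where each model has rows 1 through 30.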