Commit 0ddc3cf6 authored by Maxime MORGE

Add the ring-network game

parent 8cb44156
Showing with 1799 additions and 3 deletions
.idea CSV-plugin settings (@@ -10,21 +10,63 @@): comma-separator entries are registered for the new data and figure files:

- $PROJECT_DIR$/data/dictator/dictator_setup.csv
- $PROJECT_DIR$/data/ring/ring.1.a.csv
- $PROJECT_DIR$/data/ring/ring.1.d.csv
- $PROJECT_DIR$/data/ring/ring.2.csv
- $PROJECT_DIR$/figures/ring/ring_accuracy.1.a.csv
- $PROJECT_DIR$/figures/ring/ring_accuracy.1.b.csv
- $PROJECT_DIR$/figures/ring/ring_accuracy.1.c.csv
- $PROJECT_DIR$/figures/ring/ring_accuracy.1.d.csv
- $PROJECT_DIR$/figures/ring/ring_accuracy.2.csv

Each entry has the form:

<entry key="...">
  <value>
    <Attribute>
      <option name="separator" value="," />
    </Attribute>
  </value>
</entry>
.idea project settings (@@ -3,4 +3,5 @@): the Python 3.12 SDK is registered as the project interpreter:

<component name="Black">
  <option name="sdkName" value="Python 3.12" />
</component>
<component name="ProjectRootManager" version="2" project-jdk-name="Python 3.12" project-jdk-type="Python SDK" />
</project>
@@ -136,6 +136,98 @@ We observe that the performance of LLMs is barely better than that of a random s...

![Average Points Earned per Round Against 3-Loop Behaviour (with 95% Confidence Interval)](figures/rps/rps_3loop.svg)
## Ring-network game
A player is rational if she plays a best response to her beliefs.
She satisfies second-order rationality if she is rational and also believes that others are rational.
In other words, a second-order rational agent not only considers the best course of action for herself
but also anticipates how others make their decisions.
The experiments conducted by Kneeland (2015) demonstrate that 93% of the subjects are rational,
while 71% exhibit second-order rationality.
**[Identifying Higher-Order Rationality](https://doi.org/10.3982/ECTA11983)**
Terri Kneeland (2015), *Econometrica*, Volume 83, Issue 5, pp. 2065–2079.
DOI: [10.3982/ECTA11983](https://doi.org/10.3982/ECTA11983)
Ring games are designed to isolate the behavioral implications of different levels of rationality.
To assess players’ first- and second-order rationality, we consider a simplified version of the ring-network game.
This game features two players, each with two available strategies, where both players aim to maximize their own payoff.
The corresponding payoff matrix is shown below:
| Player 1 \ Player 2 | Strategy A | Strategy B |
|---------------------|------------|-----------|
| **Strategy X** | (15,10) | (5,5) |
| **Strategy Y** | (0,5) | (10,0) |
If Player 2 is rational, she must choose A, as B is strictly dominated (i.e., B is never a best response to any beliefs Player 2 may hold).
If Player 1 is rational, she can choose either X or Y since X is the best response if she believes Player 2 will play A and
Y is the best response if she believes Player 2 will play B.
If Player 1 satisfies second-order rationality (i.e., she is rational and believes Player 2 is rational), then she must play Strategy X.
This is because Player 1, believing that Player 2 is rational, must also believe Player 2 will play A and
since X is the best response to A, Player 1 will choose X.
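These claims are easy to verify mechanically. The following sketch (illustrative only, not part of the repository code) checks the dominance and best-response structure of the matrix above:

```python
# (Player 1 payoff, Player 2 payoff) indexed by (Player 1 action, Player 2 action)
payoffs = {
    ("X", "A"): (15, 10), ("X", "B"): (5, 5),
    ("Y", "A"): (0, 5),   ("Y", "B"): (10, 0),
}

# B is strictly dominated for Player 2: A pays her more whatever Player 1 does
assert all(payoffs[(p1, "A")][1] > payoffs[(p1, "B")][1] for p1 in ("X", "Y"))

# X is Player 1's best response to A, and Y her best response to B
assert payoffs[("X", "A")][0] > payoffs[("Y", "A")][0]
assert payoffs[("Y", "B")][0] > payoffs[("X", "B")][0]
```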
We consider three forms of belief:
- *implicit* belief: the optimal action must be deduced from the natural-language description of the payoff matrix;
- *explicit* belief: the prompt additionally states that Strategy B is strictly dominated by Strategy A;
- *given* belief: the prompt additionally states that Player 2 must choose Strategy A if she is rational.
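The three prompts are nested: each belief level appends one more hint to the previous one. A minimal sketch of this construction (illustrative; the actual prompt strings, including a label swap, are built in the `Ring.run` method shown below):

```python
from belief import Belief  # enum defined in belief.py

def build_rules(belief: Belief) -> str:
    # Natural-language payoff description (abridged here)
    implicit = "If Player 1 chooses X and Player 2 chooses A, Player 1 receives 15 points..."
    explicit = implicit + "\nB is strictly dominated by A."
    given = explicit + "\nPlayer 2 must choose A if she is rational."
    return {Belief.IMPLICIT: implicit, Belief.EXPLICIT: explicit, Belief.GIVEN: given}[belief]
```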
### Player 2
The models evaluated are Mistral-Small, Llama3, and DeepSeek-R1.
The table below reports, for each belief type, the proportion of runs (out of 30 per condition) in which the model selects the rational action A.
| Model | Given | Explicit | Implicit |
|----------------|---------|-----------|----------|
| mistral-small | 1.00 | 1.00 | 0.87 |
| llama3 | 1.00 | 0.90 | 0.17 |
| deepseek-r1 | 0.83 | 0.57 | 0.60 |
Mistral-Small consistently outperforms the other models across all belief types.
Its strong performance with implicit belief indicates that it can effectively
deduce the optimal action from the payoff matrix description.
Llama3 performs well with a given belief, but significantly underperforms with an implicit belief,
suggesting it may struggle to infer optimal actions solely from natural language descriptions.
DeepSeek-R1 shows the weakest performance overall, particularly with explicit beliefs,
indicating it is a weaker candidate than the other models for simulating rationality.
### Player 1
To adjust the difficulty of identifying the optimal action, we consider four versions of Player 1's payoff matrix:
- (a) the original setup;
- (b) the difference in payoffs is reduced;
- (c) the payoff of the correct choice X is decreased;
- (d) the payoff of the incorrect choice Y is increased.
| **Action \ Opponent Action (version)** | **A(a)** | **B(a)** | | **A(b)** | **B(b)** | | **A(c)** | **B(c)** | | **A(d)** | **B(d)** |
|----------------------------------------|----------|----------|-|----------|----------|-|----------|----------|-|----------|----------|
| **X** | 15 | 5 | | 8 | 7 | | 6 | 5 | | 15 | 5 |
| **Y** | 0 | 10 | | 7 | 8 | | 0 | 10 | | 0 | 40 |
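In every version, X remains the unique best response to A, so Strategy X stays the second-order rational choice throughout. A quick sanity check (illustrative only, not repository code):

```python
# Player 1's payoffs (XA, XB, YA, YB) for each version, mirroring the table above
versions = {
    "a": (15, 5, 0, 10),
    "b": (8, 7, 7, 8),
    "c": (6, 5, 0, 10),
    "d": (15, 5, 0, 40),
}
for v, (xa, xb, ya, yb) in versions.items():
    # Against a rational Player 2 (who plays A), X must beat Y
    assert xa > ya, f"X is not the best response to A in version {v}"
```

The table below reports the rationality rates for each version and belief type.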
| Model | | Given (a) | Explicit (a) | Implicit (a) | | Given (b) | Explicit (b) | Implicit (b) | | Given (c) | Explicit (c) | Implicit (c) | | Given (d) | Explicit (d) | Implicit (d) |
|---------------|-|-----------|--------------|--------------|-|-----------|--------------|--------------|--|-----------|--------------|--------------|--|-----------|--------------|--------------|
| llama3 | | 0.97 | 1.00 | 1.00 | | 0.77 | 0.80 | 0.60 | | 0.97 | 0.90 | 0.93 | | 0.83 | 0.90 | 0.60 |
| mistral-small | | 0.93 | 0.97 | 1.00 | | 0.87 | 0.77 | 0.60 | | 0.77 | 0.60 | 0.70 | | 0.73 | 0.57 | 0.37 |
| deepseek-r1 | | 0.80 | 0.53 | 0.57 | | 0.67 | 0.60 | 0.53 | | 0.67 | 0.63 | 0.47 | | 0.70 | 0.50 | 0.57 |
Llama3 demonstrates the most consistent and robust performance, adapting to the different belief types
and adjusted payoff matrices.
Mistral-Small performs well with given and explicit beliefs but struggles with implicit beliefs, particularly in version (d).
DeepSeek-R1 again shows the weakest performance, suggesting it is not an ideal candidate for modeling second-order rationality.
## Authors

Maxime MORGE
...
figures/ring/ring_accuracy.1.a.csv:

Model,Given,Explicit,Implicit
deepseek-r1,0.8,0.5333333333333333,0.5666666666666667
llama3,0.9666666666666667,1.0,1.0
mistral-small,0.9333333333333333,0.9666666666666667,1.0

figures/ring/ring_accuracy.1.b.csv:

Model,Given,Explicit,Implicit
deepseek-r1,0.6666666666666666,0.6,0.5333333333333333
llama3,0.7666666666666667,0.8,0.6
mistral-small,0.8666666666666667,0.7666666666666667,0.6

figures/ring/ring_accuracy.1.c.csv:

Model,Given,Explicit,Implicit
deepseek-r1,0.6666666666666666,0.6333333333333333,0.4666666666666667
llama3,0.9666666666666667,0.9,0.9333333333333333
mistral-small,0.7666666666666667,0.6,0.7

figures/ring/ring_accuracy.1.d.csv:

Model,Given,Explicit,Implicit
deepseek-r1,0.7,0.5,0.5666666666666667
llama3,0.8333333333333334,0.9,0.6
mistral-small,0.7333333333333333,0.5666666666666667,0.36666666666666664

figures/ring/ring_accuracy.2.csv:

Model,Given,Explicit,Implicit
deepseek-r1,0.8333333333333334,0.5666666666666667,0.6
llama3,1.0,0.9,0.16666666666666666
mistral-small,1.0,1.0,0.8666666666666667
from enum import Enum

class Belief(Enum):
    IMPLICIT = ("Implicit", "A belief that is assumed or inferred")
    EXPLICIT = ("Explicit", "A belief that is clearly stated or expressed")
    GIVEN = ("Given", "A belief that is directly provided as a fact")

    def __init__(self, label, description):
        self.label = label
        self.description = description
\ No newline at end of file
import os
import asyncio
from typing import Dict, Literal

from pydantic import BaseModel
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.messages import TextMessage
from autogen_core import CancellationToken
from autogen_ext.models.openai import OpenAIChatCompletionClient

from belief import Belief

# Load API key from environment variable
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
if not OPENAI_API_KEY:
    raise ValueError("Missing OPENAI_API_KEY. Set it as an environment variable.")

# Define the expected response format as a Pydantic model
class AgentResponse(BaseModel):
    action: Literal["A", "B", "X", "Y"]
    reasoning: str

# The ring game simulation class
class Ring:
    debug = False

    def __init__(self, player_id: int, belief: Belief, swap: bool, version: str, model: str, temperature: float, max_retries: int = 3):
        self.player_id = player_id
        self.belief = belief
        self.swap = swap
        # Optionally swap the action labels to control for positional bias
        self.A, self.B, self.X, self.Y = ("B", "A", "Y", "X") if swap else ("A", "B", "X", "Y")
        self.version = version
        self.model = model
        self.temperature = temperature
        self.max_retries = max_retries  # Maximum retry attempts in case of hallucinations
        is_openai_model = model.startswith("gpt")
        base_url = "https://api.openai.com/v1" if is_openai_model else "http://localhost:11434/v1"
        model_info = {
            "temperature": self.temperature,
            "function_calling": True,
            "parallel_tool_calls": True,
            "family": "unknown",
            "json_output": True,
            "vision": False
        }
        self.model_client = OpenAIChatCompletionClient(
            model=self.model,
            base_url=base_url,
            api_key=OPENAI_API_KEY,
            model_info=model_info,
            response_format=AgentResponse
        )

    async def run(self) -> Dict:
        """Runs the model and ensures a valid response."""
        action_description = (
            ' - `"action"`: Your move ("A" or "B")' if self.player_id == 2
            else ' - `"action"`: Your move ("X" or "Y")'
        )
        # Player 1's payoffs (X|A, X|B, Y|A, Y|B) for each version of the game
        XknowingA, XknowingB, YknowingA, YknowingB = (
            (15, 5, 0, 10) if self.version == "a" else
            (8, 7, 7, 8) if self.version == "b" else
            (6, 5, 0, 10) if self.version == "c" else
            (15, 5, 0, 40)
        )
        implicit_belief = f"""
- If Player 1 chooses {self.X} and Player 2 chooses {self.A}, Player 1 receives {XknowingA} points and Player 2 receives 10 points.
- If Player 1 chooses {self.X} and Player 2 chooses {self.B}, Player 1 receives {XknowingB} points and Player 2 receives 5 points.
- If Player 1 chooses {self.Y} and Player 2 chooses {self.A}, Player 1 receives {YknowingA} points and Player 2 receives 5 points.
- If Player 1 chooses {self.Y} and Player 2 chooses {self.B}, Player 1 receives {YknowingB} points and Player 2 receives 0 points.
"""
        explicit_belief = f"""
{implicit_belief}
{self.B} is strictly dominated by {self.A}.
"""
        given_belief = f"""
{explicit_belief}
Player 2 must choose {self.A} if she is rational.
"""
        rules = (
            given_belief if self.belief == Belief.GIVEN
            else explicit_belief if self.belief == Belief.EXPLICIT
            else implicit_belief
        )
        instruction = f"""
We are playing a simple game designed to evaluate your ability to play rationally.
Your aim is to maximize your own payoff.
You are Player {self.player_id}.

### **Game Rules:**
{rules}

### **Your Task:**
Return your response in JSON format with two keys:
{action_description}
- `"reasoning"`: A brief explanation of how you made your choice.
"""
        if self.debug:
            print(instruction)
        for attempt in range(self.max_retries):
            agent = AssistantAgent(
                name="Player",
                model_client=self.model_client,
                system_message="You are a helpful assistant."
            )
            response = await agent.on_messages(
                [TextMessage(content=instruction, source="user")],
                cancellation_token=CancellationToken(),
            )
            try:
                response_data = response.chat_message.content
                agent_response = AgentResponse.model_validate_json(response_data)  # Parse JSON
                action = agent_response.action
                # Validate that the action is legal for this player
                if (self.player_id == 2 and action in (self.A, self.B)) or \
                        (self.player_id == 1 and action in (self.X, self.Y)):
                    rational = 1.0 if self.check_rationality(agent_response) else 0.0
                    return {
                        "action": agent_response.action,
                        "rationality": rational,
                        "reasoning": agent_response.reasoning
                    }
                else:
                    print(f"Invalid response detected (Attempt {attempt+1}): {response_data}")
            except Exception as e:
                print(f"Error parsing response (Attempt {attempt+1}): {e}")
        raise ValueError("Model failed to provide a valid response after multiple attempts.")

    def check_rationality(self, agent_response: AgentResponse) -> bool:
        """Check if the response is rational."""
        if self.player_id == 2:
            # A rational Player 2 plays A (B is strictly dominated)
            return agent_response.action == self.A
        # A second-order rational Player 1 plays X (best response to A)
        return agent_response.action == self.X

# Run the async function and print the response
if __name__ == "__main__":
    game_agent = Ring(1, Belief.IMPLICIT, swap=True, version="a", model="llama3", temperature=0.7)
    response_json = asyncio.run(game_agent.run())
    print(response_json)
\ No newline at end of file
import asyncio
import os

import pandas as pd

from belief import Belief
from ring import Ring

class RingExperiment:
    debug = True

    def __init__(self, models: list[str], player_id: int, version: str, temperature: float, iterations: int, output_file: str):
        self.models = models
        self.player_id = player_id
        self.version = version
        self.temperature = temperature
        self.iterations = iterations
        self.output_file = output_file  # Path to the CSV output file

    # Helper function to escape double quotes in the reasoning string
    def protect_reasoning(self, reasoning):
        if reasoning:
            # Escape double quotes by doubling them (CSV convention)
            escaped = reasoning.replace('"', '""')
            return f'"{escaped}"'
        return reasoning

    async def run_experiment(self):
        beliefs = [Belief.GIVEN, Belief.EXPLICIT, Belief.IMPLICIT]
        file_exists = os.path.isfile(self.output_file)  # Check if file already exists
        # Run the ring game for each model and belief
        for model in self.models:
            if self.debug:
                print(f"Running experiment for model: {model}")
            for belief in beliefs:
                print(f"Running with belief: {belief.name}")
                for iteration in range(1, self.iterations + 1):
                    print(f"Iteration: {iteration}")
                    # Initialize the Ring player, alternating the label swap
                    # between iterations to control for positional bias
                    game_agent = Ring(
                        player_id=self.player_id,
                        belief=belief,
                        swap=(iteration % 2 == 0),
                        version=self.version,
                        model=model,
                        temperature=self.temperature
                    )
                    try:
                        agent_response = await game_agent.run()
                        action = agent_response['action']
                        rationality = agent_response['rationality']
                        reasoning = agent_response['reasoning']
                        # Protect the reasoning string by escaping double quotes
                        reasoning = self.protect_reasoning(reasoning)
                    except Exception as e:
                        print(f"Error in iteration {iteration} for model {model}: {e}")
                        action, reasoning, rationality = None, None, None
                    # Create a single-row DataFrame for the current result
                    df = pd.DataFrame([{
                        'Iteration': iteration,
                        'Model': model,
                        'Temperature': self.temperature,
                        'Belief': belief.label,
                        'action': action,
                        'rationality': rationality,
                        'reasoning': reasoning
                    }])
                    # Append results to the CSV file
                    df.to_csv(self.output_file, mode='a', header=not file_exists, index=False)
                    file_exists = True  # Ensure the header is only written once

# Running the experiment
if __name__ == "__main__":
    models = ["llama3", "mistral-small", "deepseek-r1"]  # or gpt-4.5-preview-2025-02-27
    temperature = 0.7
    iterations = 30
    player_id = 1
    version = "a"
    output_file = f"../../data/ring/ring.{player_id}.{version}.csv"
    experiment = RingExperiment(models=models, player_id=player_id, version=version, temperature=temperature, iterations=iterations, output_file=output_file)
    asyncio.run(experiment.run_experiment())
    print(f"Experiment results saved to {output_file}")
\ No newline at end of file
import pandas as pd

def process_experiment_results(version: str):
    """Loads experiment results, calculates accuracy, reorders columns, and saves to CSV."""
    # Load the experiment results
    df = pd.read_csv(f"../../data/ring/ring.1.{version}.csv")
    # Calculate the accuracy by model and belief
    accuracy_table = df.groupby(["Model", "Belief"])["rationality"].mean().unstack()
    # Reorder the columns in the desired order
    desired_order = ["Given", "Explicit", "Implicit"]
    accuracy_table = accuracy_table.reindex(columns=desired_order)
    # Display the table
    print(f"Accuracy table for version {version}\n")
    print(accuracy_table)
    # Save the table as a CSV file for future use
    accuracy_table.to_csv(f"../../figures/ring/ring_accuracy.1.{version}.csv")

# Process all versions
for version in ["a", "b", "c", "d"]:
    process_experiment_results(version)
\ No newline at end of file
import pandas as pd
# Load the experiment results
df = pd.read_csv("../../data/ring/ring.2.csv")
# Calculate the accuracy by model and belief
accuracy_table = df.groupby(["Model", "Belief"])["rationality"].mean().unstack()
desired_order = ["Given", "Explicit", "Implicit"]
accuracy_table = accuracy_table.reindex(columns=desired_order)
# Display the table
print(accuracy_table)
# Save the table as a CSV file for future use
accuracy_table.to_csv("../../figures/ring/ring_accuracy.2.csv")
\ No newline at end of file