diff --git a/README.md b/README.md
index 5ca7a283b1299ac70a25e0b27d6a48403a845bec..811716a5b7c6422b6e5dee427ae3d2453c376960 100644
--- a/README.md
+++ b/README.md
@@ -48,14 +48,14 @@ to generative AAMAS. This list is a work in progress and will be regularly updat
 
   learning models in resource-constrained environments by making these models
   more lightweight without compromising too much on performance.
 
-  **[A survey of quantization methods for efficient neural
-  network inference](https://www.crcpress.com/Low-Power-Computer-Vision/Gholami-Kim-Dong-Yao-Mahoney-Keutzer/p/book/9780367707095)**
+  **[A survey of quantization methods for efficient neural
+  network inference](https://www.crcpress.com/Low-Power-Computer-Vision/Gholami-Kim-Dong-Yao-Mahoney-Keutzer/p/book/9780367707095)**
   Amir Gholami, Sehoon Kim, Zhen Dong, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer (2022)
   Published in *Low-Power Computer Vision*, Chapman and Hall/CRC, pp. 291–326.
 
-  **[Knowledge Distillation: A Survey](https://doi.org/10.1007/s11263-021-01453-z)**
-  Jianping Gou, Baosheng Yu, Stephen J. Maybank, Dacheng Tao (2021)
-  Published in *International Journal of Computer Vision*, Volume 129, pp. 1789–1819.
+  **[Knowledge Distillation: A Survey](https://doi.org/10.1007/s11263-021-01453-z)**
+  Jianping Gou, Baosheng Yu, Stephen J. Maybank, Dacheng Tao (2021)
+  Published in *International Journal of Computer Vision*, Volume 129, pp. 1789–1819.
 
 ## Large Language Models
@@ -479,6 +479,28 @@
 dilemma where aggressive strategies can persist or even dominate.
 
   Leibo, Michael Luck (2025)
   Published on arXiv
 
+The authors consider LLMs that play finitely repeated games with full
+information and analyze their behavior when competing against other LLMs as
+well as simple, human-like strategies. Their findings show that GPT-4 acts
+particularly unforgivingly in the iterated Prisoner's Dilemma, always defecting
+after the other agent has defected just once, and that it fails to follow the
+simple convention of alternating between options in the Battle of the Sexes.
+These behaviors persist across multiple robustness checks and variations of the
+payoff matrices, and they are not caused by an inability to predict the other
+player's actions. Rather than adjusting its choices to the other player, GPT-4
+consistently selects its own preferred option and therefore fails to coordinate
+with a simple, human-like agent, a clear behavioral flaw. However, these
+behaviors can be modified: GPT-4 becomes more forgiving when explicitly
+reminded that the other player might make mistakes, and its coordination
+improves when it is first prompted to predict the other player's actions before
+selecting its own. By prompting the model to imagine possible actions and their
+outcomes before making a decision, the authors lead it to alternate far more
+effectively.
+
+- **[Playing Repeated Games with Large Language Models](https://arxiv.org/abs/2305.16867)**
+  Elif Akata, Lion Schulz, Julian Coda-Forno, Seong Joon Oh, Matthias Bethge, Eric Schulz (2023)
+  Published on arXiv
+
 ### Generative MAS on the shelf
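
To make the repeated-game dynamics in the added summary concrete, the following is a minimal Python sketch, not code from Akata et al.: the payoff values, the grim-trigger and one-mistake strategies, and the ten-round horizon are illustrative assumptions. It shows why an unforgiving strategy in the iterated Prisoner's Dilemma never returns to cooperation after a single defection, and why a player that always picks its own preferred option fails to coordinate with a partner that follows the alternation convention in the Battle of the Sexes.

```python
# Illustrative sketch only (not the paper's code); payoffs and strategies are
# placeholder assumptions chosen to mirror the behaviors described above.

# --- Iterated Prisoner's Dilemma --------------------------------------------
# Row player's payoff for (own_move, opponent_move); "C" = cooperate, "D" = defect.
PD_PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def grim_trigger(history):
    """Unforgiving play: cooperate until the opponent defects once, then defect forever."""
    return "D" if "D" in history["opponent"] else "C"

def one_mistake(history):
    """Cooperate every round except for a single 'mistake' in round 3."""
    return "D" if len(history["opponent"]) == 2 else "C"

def play_pd(strategy_a, strategy_b, rounds=10):
    hist_a = {"own": [], "opponent": []}
    hist_b = {"own": [], "opponent": []}
    score_a = score_b = 0
    for _ in range(rounds):
        move_a, move_b = strategy_a(hist_a), strategy_b(hist_b)
        score_a += PD_PAYOFF[(move_a, move_b)]
        score_b += PD_PAYOFF[(move_b, move_a)]
        hist_a["own"].append(move_a); hist_a["opponent"].append(move_b)
        hist_b["own"].append(move_b); hist_b["opponent"].append(move_a)
    return score_a, score_b, hist_a

score_grim, score_mistake, hist = play_pd(grim_trigger, one_mistake)
print("grim trigger vs. one mistake:", score_grim, score_mistake)
print("grim trigger's moves after the mistake:", hist["own"][3:])  # all "D"

# --- Battle of the Sexes ------------------------------------------------------
# Payoffs (player_a, player_b); both score only when they pick the same option,
# and each prefers a different one ("F" for player A, "B" for player B).
BOS_PAYOFF = {("F", "F"): (3, 2), ("B", "B"): (2, 3), ("F", "B"): (0, 0), ("B", "F"): (0, 0)}

def stubborn(round_idx):
    """Always pick the own preferred option, ignoring the partner."""
    return "F"

def alternating(round_idx):
    """Follow the simple convention of alternating between the two options."""
    return "F" if round_idx % 2 == 0 else "B"

totals = [0, 0]
for r in range(10):
    pay = BOS_PAYOFF[(stubborn(r), alternating(r))]
    totals[0] += pay[0]
    totals[1] += pay[1]
print("stubborn vs. alternating payoffs over 10 rounds:", totals)
```

Running the sketch shows the grim-trigger player defecting for every round after the single mistake in round three, so mutual cooperation never resumes, and the stubborn Battle of the Sexes player scoring only in the rounds where the alternating partner happens to land on its option, leaving both well below what two alternating players would earn.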