Commit 90488997 authored by Maxime MORGE
LLM4AAMAS: Add kumar25arxiv

parent 7216871c
- **[Learning Phrase Representations using RNN Encoder-Decoder for Statistical
Machine Translation](https://arxiv.org/abs/1406.1078)** *Kyunghyun Cho,
Bart van Merrienboer, Caglar Gulcehre, et al. (2014)* Published on *arXiv*
## Existing LLMs
Many models are available at the following URLs:
Published in *Advances in Neural Information Processing Systems (NeurIPS
2023)*
## Post-training
This survey explores post-training methodologies for LLMs, including
fine-tuning, reinforcement learning, and test-time scaling, together with
efficiency techniques such as LoRA and RAG. Fine-tuning improves task-specific
performance but risks overfitting and incurs high computational costs.
Test-Time Scaling (TTS) dynamically allocates additional computation at
inference, without updating the model, making it suitable for tasks with
flexible computational budgets. Pretraining and TTS serve different purposes:
pretraining builds fundamental capabilities through extensive training, while
TTS improves performance at inference time. Pretraining is crucial for novel
tasks requiring new skills, whereas TTS is effective when base models already
perform reasonably well (a minimal best-of-N sampling sketch follows the
reference below).
- **[LLM Post-Training: A Deep Dive into Reasoning Large Language Models](https://arxiv.org/abs/2502.21321)**
*Komal Kumar, Tajamul Ashraf, Omkar Thawakar, Rao Muhammad Anwer, Hisham
Cholakkal, Mubarak Shah, Ming-Hsuan Yang, Philip H. S. Torr, Salman Khan,
Fahad Shahbaz Khan (2025)* Published on *arXiv*
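As a rough illustration of the test-time scaling idea summarized above, the
following sketch implements best-of-N sampling in Python. The `generate` and
`score` callables are placeholders for a model's sampling function and a
verifier or reward score; they are illustrative assumptions, not part of any
specific library.

```python
# Minimal sketch of best-of-N sampling, one simple form of test-time scaling.
# `generate` and `score` are placeholder callables (assumptions), not a real API.
from typing import Callable, List, Tuple

def best_of_n(prompt: str,
              generate: Callable[[str], str],
              score: Callable[[str, str], float],
              n: int = 8) -> Tuple[str, float]:
    """Sample n candidate answers and return the highest-scoring one.

    The base model is never updated; extra compute is spent only at
    inference time, which is the core idea behind test-time scaling.
    """
    candidates: List[Tuple[str, float]] = []
    for _ in range(n):
        answer = generate(prompt)  # one stochastic sample from the model
        candidates.append((answer, score(prompt, answer)))
    return max(candidates, key=lambda pair: pair[1])
```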
### Tuning
#### Instruction tuning
- The fine-tuning of a pre-trained language model requires significantly less
data and fewer computational resources, especially when parameter-efficient
approaches such as Low-Rank Adaptation (LoRA) are used (see the sketch after
this list).
**[LoRA: Low-Rank Adaptation of Large Language
Models](https://arxiv.org/abs/2106.09685)** *Edward J. Hu, Yelong Shen,
Phillip Wallis, et al. (2021)* Published on *arXiv*
- The apparent mastery of textual understanding by LLMs closely resembles human
performance.
**[Language Models are Few-Shot
Learners](https://papers.nips.cc/paper/2020/file/fc2c7f9a3f3f86cde5d8ad2c7f7e57b2-Paper.pdf)**
*Tom Brown, Benjamin Mann, Nick Ryder, et al. (2020)* Presented at *NeurIPS*
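The sketch below illustrates the LoRA idea cited in the first entry of this
list: a frozen pre-trained linear layer is augmented with a trainable low-rank
update `B @ A`, so only a small fraction of the parameters are trained. It is a
minimal PyTorch illustration under simplifying assumptions, not the reference
implementation from the paper.

```python
# Minimal LoRA-style adapter: freeze the base weights, learn a low-rank update.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():   # the pre-trained weights stay frozen
            p.requires_grad = False
        # A starts as small Gaussian noise and B as zeros, so the adapted
        # layer is initially identical to the base layer.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = base(x) + scaling * x (BA)^T
        return self.base(x) + self.scaling * (x @ self.A.T @ self.B.T)

# Usage: wrap an existing projection layer of a pre-trained model.
layer = LoRALinear(nn.Linear(768, 768), r=8)
out = layer(torch.randn(2, 768))
```

Only `A` and `B` receive gradients, which is why fine-tuning with LoRA needs
far less memory and compute than updating the full weight matrix.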
#### Alignment tuning
- Instruction tuning aims to bridge the gap between the model’s original
objective of generating text and users’ expectations that the model follow
their instructions and perform specific tasks (a reward-model sketch follows
this list).
**[Training language models to follow instructions with human
feedback](https://papers.nips.cc/paper/2022/hash/17f4c5f98073d1fb95f7e53f5c7fdb64-Abstract.html)**
*Long Ouyang, Jeffrey Wu, Xu Jiang, et al. (2022)* Presented at *NeurIPS*
- Strong alignment requires cognitive abilities such as understanding and
reasoning about agents’ intentions and their ability to causally produce
desired effects.
**[Strong and weak alignment of large language models with human
values](https://doi.org/10.1038/s41598-024-70031-3)**
*Khamassi, M., Nahon, M. & Chatila, R. (2024)* Published in *Scientific
Reports*, 14, 19399.
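One concrete ingredient of the human-feedback pipeline cited above (Ouyang et
al., 2022) is a reward model trained on human preference pairs. The sketch
below shows the standard pairwise loss `-log sigmoid(r(chosen) - r(rejected))`
in PyTorch; the tiny `RewardModel` and the random encodings are illustrative
assumptions, not the architecture used in the paper.

```python
# Minimal reward-model training step for RLHF-style alignment tuning.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    """Maps an (already encoded) response representation to a scalar reward."""
    def __init__(self, hidden: int = 768):
        super().__init__()
        self.head = nn.Linear(hidden, 1)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return self.head(h).squeeze(-1)

def preference_loss(rm: RewardModel,
                    h_chosen: torch.Tensor,
                    h_rejected: torch.Tensor) -> torch.Tensor:
    # Encourage a higher reward for the human-preferred response.
    return -F.logsigmoid(rm(h_chosen) - rm(h_rejected)).mean()

# Toy usage with random encodings standing in for real hidden states.
rm = RewardModel()
loss = preference_loss(rm, torch.randn(4, 768), torch.randn(4, 768))
loss.backward()
```

The trained reward model then scores candidate responses so that the language
model can be optimized (e.g., with PPO) to follow instructions as humans
prefer.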
### Prompt engineering
#### ICL
In-context learning (ICL) involves providing the model with task
demonstrations or relevant information directly in the prompt, without
requiring any additional training (a minimal sketch follows the reference
below).
Methods in Natural Language Processing (EMNLP)* Location: Miami, Florida, USA
Published by: Association for Computational Linguistics
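A minimal sketch of in-context learning as described above: task
demonstrations are embedded directly in the prompt and no parameters are
updated. The `call_llm` function mentioned in the comments is a placeholder
for any completion API, not a specific library call.

```python
# Few-shot prompt construction: the "learning" happens entirely in context.
FEW_SHOT_EXAMPLES = [
    ("I loved this movie, it was brilliant.", "positive"),
    ("The plot was dull and the acting was worse.", "negative"),
]

def build_icl_prompt(new_input: str) -> str:
    lines = ["Classify the sentiment of each review as positive or negative.", ""]
    for text, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Review: {text}\nSentiment: {label}\n")
    lines.append(f"Review: {new_input}\nSentiment:")
    return "\n".join(lines)

prompt = build_icl_prompt("An instant classic, I would watch it again.")
# answer = call_llm(prompt)  # placeholder: the model should complete "positive"
print(prompt)
```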
#### CoT
Chain-of-thought is a prompting strategy that, instead of being limited to
input-output pairs, incorporates intermediate reasoning steps that serve as a
guide for how to solve problems (a minimal sketch follows the reference
below).
- **[Towards Reasoning in Large Language Models: A
Survey](https://arxiv.org/abs/2212.10403)** *Jie Huang and Kevin Chen-Chuan
Chang (2023)* Published on *arXiv*
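A minimal sketch of chain-of-thought prompting as described above: the
in-context example contains intermediate reasoning steps rather than a bare
input-output pair. As before, `call_llm` is a placeholder assumption for any
completion API.

```python
# Chain-of-thought prompt: the demonstration spells out the reasoning steps.
COT_EXAMPLE = (
    "Q: A farmer has 3 pens with 4 sheep each and buys 5 more sheep. "
    "How many sheep does he have?\n"
    "A: Let's think step by step. 3 pens with 4 sheep each is 3 * 4 = 12 sheep. "
    "Buying 5 more gives 12 + 5 = 17. The answer is 17.\n"
)

def build_cot_prompt(question: str) -> str:
    return COT_EXAMPLE + f"Q: {question}\nA: Let's think step by step."

prompt = build_cot_prompt(
    "A library has 7 shelves with 30 books each and receives 40 new books. "
    "How many books does it have?"
)
# answer = call_llm(prompt)  # placeholder: the reasoning trace precedes the answer
print(prompt)
```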
#### RAG
Retrieval-Augmented Generation (RAG) is a prompting strategy that involves
integrating relevant information retrieved from external data sources into the
prompt at inference time.
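A minimal sketch of the RAG pattern described above: relevant passages are
retrieved from an external store and prepended to the prompt. The
keyword-overlap retriever and the `call_llm` placeholder are simplifying
assumptions; practical systems typically rely on dense vector search.

```python
# Toy RAG pipeline: retrieve supporting passages, then condition generation on them.
from typing import List

DOCUMENTS = [
    "LoRA adds trainable low-rank matrices to frozen pre-trained weights.",
    "Chain-of-thought prompting elicits intermediate reasoning steps.",
    "Retrieval-Augmented Generation grounds answers in external documents.",
]

def retrieve(query: str, docs: List[str], k: int = 2) -> List[str]:
    """Rank documents by word overlap with the query (toy retriever)."""
    q = set(query.lower().split())
    return sorted(docs, key=lambda d: -len(q & set(d.lower().split())))[:k]

def build_rag_prompt(question: str) -> str:
    context = "\n".join(f"- {d}" for d in retrieve(question, DOCUMENTS))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_rag_prompt("What does Retrieval-Augmented Generation do?")
# answer = call_llm(prompt)  # placeholder: generation is grounded in retrieved text
print(prompt)
```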