Commit 90488997 authored 1 month ago by Maxime MORGE
LLM4AAMAS: Addkumar25arxiv
parent 7216871c
Showing 1 changed file with 61 additions and 41 deletions.

README.md +61 −41
...
@@ -106,43 +106,6 @@ to generative AAMAS. This list is a work in progress and will be regularly updat
Machine Translation](https://arxiv.org/abs/1406.1078)** *Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, et al. (2014)* Published on *arXiv*
## Tuning
### Instruction tuning
- The fine-tuning of a pre-trained language model requires significantly less data and fewer computational resources, especially when parameter-efficient approaches such as Low-Rank Adaptation (LoRA) are used.
  **[LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685)** *Edward J. Hu, Yelong Shen, Phillip Wallis, et al. (2021)* Published on *arXiv*
- The apparent mastery of textual understanding by LLMs closely resembles human performance.
  **[Language Models are Few-Shot Learners](https://papers.nips.cc/paper/2020/file/fc2c7f9a3f3f86cde5d8ad2c7f7e57b2-Paper.pdf)** *Tom Brown, Benjamin Mann, Nick Ryder, et al. (2020)* Presented at *NeurIPS*
### Alignment tuning
- Instruction tuning aims to bridge the gap between the model’s original objective, generating text, and user expectations, where users want the model to follow their instructions and perform specific tasks.
  **[Training language models to follow instructions with human feedback](https://papers.nips.cc/paper/2022/hash/17f4c5f98073d1fb95f7e53f5c7fdb64-Abstract.html)** *Long Ouyang, Jeffrey Wu, Xu Jiang, et al. (2022)* Presented at *NeurIPS*
- Strong alignment requires cognitive abilities such as understanding and reasoning about agents’ intentions and their ability to causally produce desired effects.
  **[Strong and weak alignment of large language models with human values](https://doi.org/10.1038/s41598-024-70031-3)** *Khamassi, M., Nahon, M. & Chatila, R. (2024)* Published in *Scientific Reports* **14**, 19399
## Existing LLMs
Many models are available at the following URLs:
...
@@ -196,9 +159,66 @@ Many models are available at the following URLs:
Published in *Advances in Neural Information Processing Systems (NeurIPS 2023)*
## Prompt engineering
## Post-training
This survey explores post-training methodologies for LLMs, including fine-tuning, reinforcement learning, and scaling techniques such as LoRA and RAG. Fine-tuning improves task-specific performance but risks overfitting and incurs high computational costs. Test-Time Scaling (TTS) instead optimizes inference dynamically, without updating the model, making it suitable for tasks with flexible computational budgets (a toy sketch follows the reference below). Pretraining and TTS serve different purposes: pretraining builds fundamental capabilities through extensive training, while TTS improves performance at inference time. Pretraining is crucial for novel tasks requiring new skills, whereas TTS is effective when base models already perform reasonably well.
- **[LLM Post-Training: A Deep Dive into Reasoning Large Language Models](https://arxiv.org/abs/2502.21321)** *Komal Kumar, Tajamul Ashraf, Omkar Thawakar, et al. (2025)* Published on *arXiv* (cs.CL)
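As a toy illustration of the test-time scaling idea above, here is a best-of-N sampling sketch in Python: the weights stay fixed, and extra inference compute buys a better answer. `generate` and `score` are hypothetical placeholders for a sampling endpoint and a verifier or reward model, not part of any specific library.

```python
import random

def generate(prompt: str, temperature: float = 0.8) -> str:
    # Hypothetical stand-in for one sampled completion from an LLM endpoint.
    return f"candidate (T={temperature}, draw={random.random():.3f})"

def score(prompt: str, answer: str) -> float:
    # Hypothetical verifier or reward model; a random proxy here.
    return random.random()

def best_of_n(prompt: str, n: int = 8) -> str:
    # Spend more inference-time compute (n samples) without touching weights.
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda ans: score(prompt, ans))

print(best_of_n("What is 17 * 24?"))
```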
### Tuning
#### Instruction tuning
- The fine-tuning of a pre-trained language model requires significantly less data and fewer computational resources, especially when parameter-efficient approaches such as Low-Rank Adaptation (LoRA) are used (a minimal sketch follows this list).
  **[LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685)** *Edward J. Hu, Yelong Shen, Phillip Wallis, et al. (2021)* Published on *arXiv*
-
The apparent mastery of textual understanding by LLMs closely resembles human
performance.
**[Language Models are Few-Shot
Learners](https://papers.nips.cc/paper/2020/file/fc2c7f9a3f3f86cde5d8ad2c7f7e57b2-Paper.pdf)**
Tom Brown, Benjamin Mann, Nick Ryder, et al. (2020)* Presented at *NeurIPS*
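A minimal PyTorch sketch of the low-rank update idea from the LoRA entry above: the pre-trained weight is frozen and only the rank-r factors A and B are trained, so the trainable parameter count drops to r·(d_in + d_out) per layer. This is an illustrative reconstruction, not the authors' code.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update:
    y = W x + (alpha / r) * B A x, with only A and B trained."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # freeze the pre-trained weights
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at start
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768), r=8)
out = layer(torch.randn(2, 768))  # gradients flow only into A and B
```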
#### Alignment tuning
- Instruction tuning aims to bridge the gap between the model’s original objective, generating text, and user expectations, where users want the model to follow their instructions and perform specific tasks (see the reward-model sketch after this list).
  **[Training language models to follow instructions with human feedback](https://papers.nips.cc/paper/2022/hash/17f4c5f98073d1fb95f7e53f5c7fdb64-Abstract.html)** *Long Ouyang, Jeffrey Wu, Xu Jiang, et al. (2022)* Presented at *NeurIPS*
- Strong alignment requires cognitive abilities such as understanding and reasoning about agents’ intentions and their ability to causally produce desired effects.
  **[Strong and weak alignment of large language models with human values](https://doi.org/10.1038/s41598-024-70031-3)** *Khamassi, M., Nahon, M. & Chatila, R. (2024)* Published in *Scientific Reports* **14**, 19399
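The Ouyang et al. pipeline above fits a reward model on pairwise human preferences before the reinforcement learning stage. Below is a minimal PyTorch sketch of that pairwise ranking loss, -log σ(r_chosen − r_rejected); the toy tensors stand in for real reward-head outputs.

```python
import torch
import torch.nn.functional as F

def preference_loss(r_chosen: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    # Maximize the reward margin between the human-preferred completion
    # and the rejected one: -log sigmoid(r_chosen - r_rejected).
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Toy scores a reward head might assign to three comparison pairs.
r_chosen = torch.tensor([1.2, 0.3, 2.0])
r_rejected = torch.tensor([0.4, 0.5, -1.0])
print(preference_loss(r_chosen, r_rejected))  # small when chosen outscores rejected
```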
### Prompt engineering
### ICL
#### ICL
In-context learning involves providing the model with demonstrations or relevant information directly in the prompt, without requiring additional training (a minimal sketch follows the reference below).
...
@@ -209,7 +229,7 @@ without requiring additional training.
Methods in Natural Language Processing (EMNLP)* Location: Miami, Florida, USA. Published by: Association for Computational Linguistics
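A minimal sketch of in-context learning, assuming a generic text-completion endpoint (not shown): the demonstrations placed in the prompt are the only "training" signal, and no parameters change.

```python
def few_shot_prompt(demos: list[tuple[str, str]], query: str) -> str:
    # Demonstrations go directly into the prompt; the model adapts
    # from context alone, with no gradient update.
    blocks = [f"Input: {x}\nOutput: {y}" for x, y in demos]
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)

demos = [
    ("I loved this movie!", "positive"),
    ("Terrible service, never again.", "negative"),
]
print(few_shot_prompt(demos, "The plot was dull but the acting was great."))
```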
### CoT
#### CoT
Chain-of-thought is a prompting strategy that, instead of being limited to
input-output pairs, incorporates intermediate reasoning steps that serve as a
...
@@ -229,7 +249,7 @@ solve problems.
Survey](https://arxiv.org/abs/2212.10403)** *Jie Huang and Kevin Chen-Chuan Chang (2023)* Published on *arXiv*
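To make the contrast with plain input-output prompting concrete, here is a small sketch that prepends a worked demonstration (the classic tennis-ball example from the chain-of-thought literature) so the model imitates the intermediate steps before stating a final answer. The wording is illustrative, not taken from the cited survey.

```python
COT_DEMO = (
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 balls each. "
    "How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n\n"
)

def cot_prompt(question: str) -> str:
    # The demonstration exposes intermediate reasoning steps, not just
    # an input-output pair, so the model reproduces the derivation.
    return COT_DEMO + f"Q: {question}\nA:"

print(cot_prompt("A baker made 23 muffins and sold 17. How many are left?"))
```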
### RAG
#### RAG
Retrieval-Augmented Generation (RAG) is a prompting strategy that involves
integrating relevant information from external data sources into the
...
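A minimal sketch of the RAG loop just described, with a toy word-overlap retriever standing in for a real vector store; the corpus, scoring rule, and function names are all illustrative assumptions.

```python
def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    # Toy retriever ranking documents by word overlap with the query;
    # a real system would search a vector index of dense embeddings.
    q = set(query.lower().split())
    return sorted(corpus, key=lambda d: -len(q & set(d.lower().split())))[:k]

def rag_prompt(query: str, corpus: list[str]) -> str:
    # Inject the retrieved passages into the prompt ahead of the question.
    context = "\n".join(f"- {doc}" for doc in retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

corpus = [
    "LoRA freezes pre-trained weights and trains low-rank update matrices.",
    "Chain-of-thought prompting elicits intermediate reasoning steps.",
    "RAG augments generation with documents fetched from an external source.",
]
print(rag_prompt("How does LoRA adapt a pre-trained model?", corpus))
```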