diff --git a/README.md b/README.md
index ea4b8c28214c458de2789c838e2dd2e65cc8bdf2..1b69fc7b411ba3630f768824ae2ec64032111203 100644
--- a/README.md
+++ b/README.md
@@ -44,6 +44,19 @@ to generative AAMAS. This list is a work in progress and will be regularly updat
   Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton (2012)
   Presented at *NeurIPS*
 
+- Quantization and distillation are two popular techniques for deploying deep
+  learning models in resource-constrained environments: they make models more
+  lightweight without compromising much on performance.
+
+  **[A Survey of Quantization Methods for Efficient Neural
+  Network Inference](https://www.crcpress.com/Low-Power-Computer-Vision/Gholami-Kim-Dong-Yao-Mahoney-Keutzer/p/book/9780367707095)**
+  Amir Gholami, Sehoon Kim, Zhen Dong, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer (2022)
+  Published in *Low-Power Computer Vision*, Chapman and Hall/CRC, pp. 291–326
+
+  **[Knowledge Distillation: A Survey](https://doi.org/10.1007/s11263-021-01453-z)**
+  Jianping Gou, Baosheng Yu, Stephen J. Maybank, Dacheng Tao (2021)
+  Published in *International Journal of Computer Vision*, Volume 129, pp. 1789–1819
+
 ## Large Language Models
 
 - The literature review of the recent advances in LLMs shown that scaling can