
OPTIMIZING LLAMA 3.2 1B USING QUANTIZATION TECHNIQUES USING BITSANDBYTES FOR EFFICIENT AI DEPLOYMENT

Journal: International Journal of Advanced Research (Vol.13, No. 03)

Publication Date:

Authors :

Page : 78-88

Keywords : Large Language Models (LLMs); Quantization Techniques; Post-Training Quantization (PTQ); Bitsandbytes; LLaMA 3.2 1B; Inference Efficiency

Source : Download | Find it from : Google Scholar

Abstract

Large Language Models (LLMs) have transformed natural language processing, achieving state-of-the-art performance on a wide range of tasks. However, their high computational and memory requirements pose significant challenges for deployment, especially on resource-constrained hardware. In this paper, we conduct a controlled experiment to optimize the LLaMA 3.2 1B model using post-training quantization (PTQ) with the bitsandbytes library.
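As an illustration of the approach described in the abstract, the following is a minimal sketch of post-training quantization of LLaMA 3.2 1B using bitsandbytes through the Hugging Face transformers API. The specific settings (4-bit NF4, double quantization, bfloat16 compute) and the model identifier are illustrative assumptions, not necessarily the exact configuration used in the paper.

# Minimal sketch: load LLaMA 3.2 1B with bitsandbytes post-training quantization.
# The 4-bit NF4 configuration below is an assumption for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-3.2-1B"  # assumed Hugging Face model identifier

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize weights to 4 bits at load time (PTQ)
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization data type
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # precision used for dequantized compute
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)

# Quick inference check on the quantized model.
inputs = tokenizer("Quantization reduces memory footprint by", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Because the weights are quantized as they are loaded, no retraining is required; the trade-off between memory savings and accuracy depends on the chosen bit width and quantization type.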
