OPTIMIZING LLAMA 3.2 1B USING QUANTIZATION TECHNIQUES USING BITSANDBYTES FOR EFFICIENT AI DEPLOYMENT
Journal: International Journal of Advanced Research (Vol. 13, No. 03)
Publication Date: 2025-03-20
Authors: Neeraj Maddel; Shantipal Ohol; Anish Khobragade
Pages: 78-88
Keywords: Large Language Models (LLMs); Quantization Techniques; Post-Training Quantization (PTQ); Bitsandbytes; LLaMA 3.2 1B; Inference Efficiency
Abstract
Large Language Models (LLMs) have transformed natural language processing, achieving state-of-the-art performance across a wide range of tasks. However, their high computational and memory requirements pose significant challenges for deployment, especially on resource-constrained hardware. In this paper, we conduct a controlled experiment to optimize the LLaMA 3.2 1B model using post-training quantization (PTQ) with the bitsandbytes library.
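
The post-training quantization setup described in the abstract can be sketched with the Hugging Face transformers and bitsandbytes libraries. The model ID, 4-bit NF4 settings, and compute dtype below are illustrative assumptions, not the authors' exact configuration.

# Minimal sketch: loading LLaMA 3.2 1B with 4-bit post-training quantization
# via bitsandbytes. Model ID and quantization parameters are assumed for
# illustration and may differ from the paper's experimental setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-3.2-1B"  # assumed Hugging Face model ID

# NF4 4-bit weights with bfloat16 compute; double quantization further
# reduces the memory overhead of the quantization constants.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Quick inference check on the quantized model.
inputs = tokenizer("Quantization reduces memory by", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Because the weights are quantized after training (no retraining or fine-tuning), this approach trades a small amount of accuracy for a substantially smaller memory footprint, which is what makes it attractive for resource-constrained deployment.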