Yeah. But runtime (on-the-fly) quantization is not that difficult. Of course, it does get harder once you start caring about accuracy and speed…
Also, there are many high-quality pre-quantized GGUF files on the Hub, so it's worth searching for those too.
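If you want to search programmatically, here's a minimal sketch using huggingface_hub (GGUF repos are tagged "gguf" on the Hub; the "gemma-3n" search term is just an example):

# pip install -U huggingface_hub
from huggingface_hub import HfApi

api = HfApi()
# GGUF conversions carry the "gguf" tag; adjust the search term as needed
for m in api.list_models(filter="gguf", search="gemma-3n", limit=5):
    print(m.id)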
# pip install -U transformers accelerate bitsandbytes
from transformers import pipeline, BitsAndBytesConfig
import torch

# 4-bit NF4 quantization on the fly, per
# https://huggingface.co/blog/4bit-transformers-bitsandbytes
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,  # quantize the quantization constants too
    bnb_4bit_compute_dtype=torch.bfloat16,
)
pipe = pipeline(
    "image-text-to-text",
    model="google/gemma-3n-e4b-it",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    # quantization settings must go through model_kwargs so they reach from_pretrained
    model_kwargs={"quantization_config": nf4_config},
)
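A quick smoke test might look like this (a sketch assuming a CUDA GPU with enough free VRAM; the image URL is just a placeholder pointing at one of the HF documentation images):

# Chat-style input for the image-text-to-text pipeline
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg"},
            {"type": "text", "text": "Describe this image."},
        ],
    }
]
out = pipe(text=messages, max_new_tokens=64, return_full_text=False)
print(out[0]["generated_text"])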