Hi!
I am trying to quantize Llama-2 with GPTQ, using this:

```python
import os

from transformers import AutoTokenizer, GPTQConfig

os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512"

model_id = "meta-llama/Llama-2-70b-hf"
dataset = "wikitext2"
bits = 4

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False, cache_dir="./models")
gptq_config = GPTQConfig(bits=bits, dataset=dataset, tokenizer=tokenizer)
```
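In case it helps, the full flow I'm attempting looks roughly like this (a sketch, not a verified script: the `device_map="auto"` and `cache_dir` choices are mine, and passing `quantization_config` to `from_pretrained` is what triggers the GPTQ calibration run on the chosen dataset):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "meta-llama/Llama-2-70b-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=False, cache_dir="./models")

# "c4" is the dataset I actually want; swapping in "wikitext2" works fine.
gptq_config = GPTQConfig(bits=4, dataset="c4", tokenizer=tokenizer)

# Passing quantization_config runs GPTQ calibration while loading the model.
quantized_model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=gptq_config,
    device_map="auto",  # requires accelerate
    cache_dir="./models",
)
```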
But I want to use c4 instead of wikitext2. With both `c4` and `c4-new` it prints something like:

```
Repo card metadata block was not found. Setting CardData to empty.
```

and then throws an error during quantization. Everything works perfectly if I use `wikitext2`. Is there a reason why I can't use c4?
Thanks and regards,
Mahi
Solved: it turns out that installing optimum and transformers from their git repos and upgrading accelerate fixes it!
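For anyone else who hits this, the commands I mean are along these lines (standard GitHub repos for both libraries; exact versions will vary):

```shell
pip install --upgrade accelerate
pip install git+https://github.com/huggingface/optimum.git
pip install git+https://github.com/huggingface/transformers.git
```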