Error When Trying to Finetune Llama 2 Chat 13B

jscode13 · October 2, 2023, 4:12pm

# Imports used by the snippet below. The hyperparameter variables
# (dataset_name, model_name, use_4bit, ...) are assumed to be defined earlier.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from trl import SFTTrainer

dataset = load_dataset(dataset_name, split="train")

compute_dtype = getattr(torch, bnb_4bit_compute_dtype)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=use_4bit,
    bnb_4bit_quant_type=bnb_4bit_quant_type,
    bnb_4bit_compute_dtype=compute_dtype,
    bnb_4bit_use_double_quant=use_nested_quant,
)

if compute_dtype == torch.float16 and use_4bit:
    major, _ = torch.cuda.get_device_capability()
    if major >= 8:
        print("=" * 80)
        print("Your GPU supports bfloat16: accelerate training with bf16=True")
        print("=" * 80)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map=device_map
)
model.config.use_cache = False
model.config.pretraining_tp = 1

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

peft_config = LoraConfig(
    lora_alpha=lora_alpha,
    lora_dropout=lora_dropout,
    r=lora_r,
    bias="none",
    task_type="CAUSAL_LM",
)

training_arguments = TrainingArguments(
    output_dir=output_dir,
    num_train_epochs=num_train_epochs,
    per_device_train_batch_size=per_device_train_batch_size,
    gradient_accumulation_steps=gradient_accumulation_steps,
    optim=optim,
    save_steps=save_steps,
    logging_steps=logging_steps,
    learning_rate=learning_rate,
    weight_decay=weight_decay,
    fp16=fp16,
    bf16=bf16,
    max_grad_norm=max_grad_norm,
    max_steps=max_steps,
    warmup_ratio=warmup_ratio,
    group_by_length=group_by_length,
    lr_scheduler_type=lr_scheduler_type,
    report_to="tensorboard"
)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    tokenizer=tokenizer,
    args=training_arguments,
    packing=packing,
)

trainer.train()

trainer.model.save_pretrained(new_model)

ValueError                                Traceback (most recent call last)
in <cell line: 32>()
     30
     31 # Load LLaMA tokenizer
---> 32 tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
     33 tokenizer.pad_token = tokenizer.eos_token
     34 tokenizer.padding_side = "right"

4 frames
/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_fast.py in __init__(self, *args, **kwargs)
    118             fast_tokenizer = convert_slow_tokenizer(slow_tokenizer)
    119         else:
--> 120             raise ValueError(
    121                 "Couldn't instantiate the backend tokenizer from one of: \n"
    122                 "(1) a tokenizers library serialization file, \n"

ValueError: Couldn't instantiate the backend tokenizer from one of:
(1) a tokenizers library serialization file,
(2) a slow tokenizer instance to convert or
(3) an equivalent slow tokenizer class to instantiate and convert.
You need to have sentencepiece installed to convert a slow tokenizer to a fast one.

This code worked fine with the 7B version of this model. All I did was change it to the 13B version and now I’m getting this error. Please let me know what you think is wrong. Thanks.
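
The last line of the error message says that sentencepiece must be installed to convert a slow tokenizer to a fast one. A minimal sketch, assuming a missing package really is the cause here, of how to check whether it is importable in the current environment. The helper name `missing_packages` is made up for illustration, and checking `tokenizers` as well is an extra assumption since the error only names sentencepiece:

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "sentencepiece" is the package named in the error message above.
# "tokenizers" is checked too because it backs the fast tokenizer classes
# (an assumption about what else might be missing, not something the error states).
missing = missing_packages(["sentencepiece", "tokenizers"])

if missing:
    print("Not installed:", ", ".join(missing))
    print("Try: pip install " + " ".join(missing))
else:
    print("Both packages are importable, so the cause is probably something else.")
```

If sentencepiece turns out to be missing, installing it and then restarting the notebook runtime before re-running the cell may be needed, since an already-imported transformers may not pick up the new package.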
