I am trying the example notebook: Google Colab
The only change I made was adding a data_collator:
from transformers import DataCollatorWithPadding
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    data_collator=data_collator,
    train_dataset=train_dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    dataset_num_proc=2,
    packing=False,  # Can make training 5x faster for short sequences.
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        warmup_steps=5,
        max_steps=60,  # Set num_train_epochs = 1 for full training runs
        learning_rate=2e-4,
        fp16=not torch.cuda.is_bf16_supported(),
        bf16=torch.cuda.is_bf16_supported(),
        logging_steps=1,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        output_dir="outputs",
    ),
)
But when I call trainer.train(), I get this error:
ValueError: The model did not return a loss from the inputs, only the following keys: logits. For reference, the inputs it received are input_ids,attention_mask.
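
One possible reading of the error, offered as an assumption rather than a confirmed diagnosis: the batch reaching the model contains only input_ids and attention_mask, and a padding-only collator such as DataCollatorWithPadding does not create a labels field, so the model has nothing to compute a loss against. The plain-Python sketch below illustrates the difference. It is not the transformers implementation, and the names pad_only, pad_with_labels, and the pad id of 0 are hypothetical.

```python
# Illustrative sketch only; this is NOT the transformers implementation.
# pad_only and pad_with_labels are hypothetical helper names.

IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss ignores by default


def pad_only(batch, pad_id):
    """Mimic a padding-only collator: pad the fields, add no labels."""
    width = max(len(ids) for ids in batch)
    return {
        "input_ids": [ids + [pad_id] * (width - len(ids)) for ids in batch],
        "attention_mask": [[1] * len(ids) + [0] * (width - len(ids)) for ids in batch],
    }


def pad_with_labels(batch, pad_id):
    """Same padding, plus labels copied from input_ids with padding masked out."""
    out = pad_only(batch, pad_id)
    out["labels"] = [
        [tok if keep else IGNORE_INDEX for tok, keep in zip(ids, mask)]
        for ids, mask in zip(out["input_ids"], out["attention_mask"])
    ]
    return out


batch = [[5, 6, 7], [8, 9]]  # two already-tokenized examples of different lengths
print(sorted(pad_only(batch, pad_id=0)))           # keys only: no "labels"
print(pad_with_labels(batch, pad_id=0)["labels"])  # padded position becomes -100
```

If this reading is right, a collator that builds labels, for example DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False) from transformers, or simply leaving data_collator unset so that SFTTrainer falls back to its default, may be enough to make the loss appear.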