Hello there,
I am trying to fine-tune an XLM-R model with PEFT (LoRA), but training is slower with PEFT than when fine-tuning the full model. Is this the expected behaviour?
Code sample
```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    TrainingArguments,
)

dataset = load_dataset(dataset_name)
device = "cuda:0" if (torch.cuda.is_available() and not args.cpu) else "cpu"
print(f"Training on {device} (CUDA available: {torch.cuda.is_available()})")
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint,
    problem_type="regression",
    num_labels=len(CATEGORIES),
)
# Fall back to the tokenizer's limit for non-XLM-R checkpoints
# (the model object has no `max_length` attribute).
max_length = (
    tokenizer.max_model_input_sizes["xlm-roberta-base"]
    if "xlm" in checkpoint.lower()
    else tokenizer.model_max_length
)
print(f"Truncating to {max_length} tokens")
tokenized_datasets = dataset.map(
    lambda samples: tokenizer(
        samples["text"],
        truncation=True,
        max_length=max_length,
    ),
    batched=True,
    batch_size=512,
    num_proc=12,
)
# No padding / return_tensors here: the Trainer's default data collator
# pads each training batch dynamically.
# Evaluate every 20% of the training set (five evaluations per epoch).
steps_by_evaluation = int(
    dataset["train"].shape[0] / config["training_batch_size"] / 5
)
print(f"Evaluating every {steps_by_evaluation} steps")
# LoRA part
model.gradient_checkpointing_enable()  # saves memory, recomputes activations
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    inference_mode=False,
    lora_dropout=0.05,
    bias="none",
    task_type="SEQ_CLS",
)
model = get_peft_model(model, lora_config).to(device)
model.print_trainable_parameters()  # prints directly and returns None
training_args = TrainingArguments(
    output_dir="outputs",  # required argument
    evaluation_strategy="steps",
    save_strategy="steps",
    eval_steps=steps_by_evaluation,
    save_steps=steps_by_evaluation,
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    weight_decay=1,
    warmup_ratio=0.1,
    seed=42,
)
# MultilabelTrainer is my custom Trainer subclass (multi-label loss).
trainer = MultilabelTrainer(
    model,
    training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["validation"],
    tokenizer=tokenizer,
)
trainer.train()
```
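My working hypothesis is that gradient checkpointing is the culprit, since it recomputes activations during the backward pass and is expected to slow down each step. Below is a stripped-down LoRA-only sketch (no gradient checkpointing, no kbit preparation; same `checkpoint` and `CATEGORIES` as above) that I plan to benchmark against:

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForSequenceClassification

# Same base model as above, but left untouched: no gradient checkpointing
# and no prepare_model_for_kbit_training.
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint,
    problem_type="regression",
    num_labels=len(CATEGORIES),
)
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="SEQ_CLS",
)
model = get_peft_model(model, lora_config).to("cuda:0")
model.print_trainable_parameters()
```

If this variant trains as fast as full fine-tuning, the slowdown would come from checkpointing rather than from LoRA itself.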
While I am here, I have a second question. Following the example notebook but using IA3 instead of LoRA, I get the following error:

```
ValueError: Please specify `target_modules` in `peft_config`
```

I can’t find an example of an IA3 config with `target_modules` anywhere in the docs, and the example notebook doesn’t seem to work either.
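From the IA3 paper, the learned vectors rescale the attention keys and values and the feed-forward activations, so my best guess for XLM-R is the sketch below. The module names ("key", "value", "output.dense") are read off the XLM-RoBERTa architecture and I am not sure they are what the config expects; in particular, the suffix "output.dense" matches both the attention output projection and the feed-forward output projection. Is this the intended way to configure it?

```python
from peft import IA3Config, get_peft_model

# Guessed, unverified config: rescale keys, values, and the feed-forward
# block. Note "output.dense" matches both
# encoder.layer.*.attention.output.dense and encoder.layer.*.output.dense.
ia3_config = IA3Config(
    task_type="SEQ_CLS",
    target_modules=["key", "value", "output.dense"],
    feedforward_modules=["output.dense"],  # must be a subset of target_modules
)
model = get_peft_model(model, ia3_config)
model.print_trainable_parameters()
```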
Thank you in advance for your time.