Hello, I am new to the Hugging Face library and I am currently going through the course.
I want to fine-tune a BERT model on a dataset (just as demonstrated in the course), but when I run it, the estimated runtime is over 20 hours.
I therefore tried to run the code on my GPU by importing torch and moving the model to it, but the runtime does not go down.
However, the course says it should only take a couple of minutes with a GPU.
Can someone explain what I am doing wrong? I have an NVIDIA RTX 2060, 16 GB of DDR4 RAM, and an AMD Ryzen 7.
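In case it is relevant, here is the sanity check I understand can tell a CPU-only PyTorch build apart from a CUDA one (as far as I know, torch.version.cuda is None on a CPU-only install):

import torch

print(torch.__version__)          # a "+cpu" suffix here would mean a CPU-only build
print(torch.version.cuda)         # None on a CPU-only build, a version string otherwise
print(torch.cuda.is_available())  # should be True
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # should show the RTX 2060

Here is the full training code: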
from datasets import load_dataset
from transformers import AutoTokenizer, DataCollatorWithPadding
raw_datasets = load_dataset("glue", "mrpc")
checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
def tokenize_function(example):
    return tokenizer(example["sentence1"], example["sentence2"], truncation=True)
tokenized_datasets = raw_datasets.map(tokenize_function, batched=True)
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
from transformers import TrainingArguments
training_args = TrainingArguments("test-trainer")
from transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
import torch

device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
model = model.to(device)
print(device)  # this prints "cuda"
from transformers import Trainer
trainer = Trainer(
    model,
    training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["validation"],
    data_collator=data_collator,
    tokenizer=tokenizer,
)
trainer.train()
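To double-check the placement, these two prints (my own addition, not from the course) should both say cuda if training really happens on the GPU:

print(next(model.parameters()).device)  # device holding the model weights
print(trainer.args.device)              # device the Trainer resolves for training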
The code comes almost directly from the "Fine-tuning a pretrained model" page of the Hugging Face Course.
While it runs, Task Manager shows a small spike in GPU usage at the very beginning, but then the GPU load drops back to 0.
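If it helps with diagnosing, I could also log allocated CUDA memory during training with a small callback; this is just a sketch of mine, and MemoryLogger is my own name for it, not something from transformers:

import torch
from transformers import TrainerCallback

class MemoryLogger(TrainerCallback):
    # Prints allocated CUDA memory every 50 optimizer steps.
    def on_step_end(self, args, state, control, **kwargs):
        if state.global_step % 50 == 0:
            mb = torch.cuda.memory_allocated() / 1024**2
            print(f"step {state.global_step}: {mb:.0f} MiB allocated on the GPU")

trainer.add_callback(MemoryLogger())
trainer.train()

If the GPU were actually used, I would expect that number to be well above zero during training.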
Thank you in advance!