Finetuning Transformers for Text Classification Issue

import torch
from datasets import load_dataset, load_metric
from transformers import AutoTokenizer, AutoModel, TrainingArguments, Trainer

# Load the CoLA dataset
cola_dataset = load_dataset("glue", "cola")

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", num_labels=2)

# Set the device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Define the training arguments
training_args = TrainingArguments(
    output_dir='./results',
    evaluation_strategy='steps',
    eval_steps=100,
    save_total_limit=1,
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,
    push_to_hub=False,
)

# Define the trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=cola_dataset['train'],
    eval_dataset=cola_dataset['validation'],
    tokenizer=tokenizer,
    compute_metrics=load_metric('glue', 'cola'),
)

# Fine-tune on the CoLA dataset
trainer.train()

Running the code above, I get the following error:

ValueError: You should supply an encoding or a list of encodings to this method that includes input_ids, but you provided ['label']

I checked the official text classification tutorial notebook, which uses the Yelp Reviews dataset. Even though I'm using the Trainer rather than a native PyTorch training loop, I tried renaming label to labels as it suggests. With the resulting code, Google Colab (see the error output in the final cell), I now get:

ValueError: The model did not return a loss from the inputs, only the following keys: last_hidden_state,pooler_output. For reference, the inputs it received are input_ids,token_type_ids,attention_mask.

Can someone help me understand what exactly I am doing wrong here? Thanks!

You will have to tokenize your inputs before passing them to the Trainer API. This includes both the text and the target.

Something like this: Fine-tuning a model with the Trainer API - Hugging Face NLP Course
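
A minimal sketch of that tokenization step, adapted to CoLA. It assumes the GLUE CoLA splits keep their text in a `sentence` column and an integer class id in `label`; since that label is already numeric, only the text goes through the tokenizer here. The helper names `tokenize_batch` and `prepare_cola` are made up for illustration.

```python
def tokenize_batch(batch, tokenizer):
    """Tokenize the CoLA "sentence" column; the integer labels are left untouched."""
    return tokenizer(batch["sentence"], truncation=True)


def prepare_cola(checkpoint="bert-base-uncased"):
    """Return tokenized CoLA splits plus the tokenizer and a padding collator."""
    # Imports are local so tokenize_batch can be reused without these libraries.
    from datasets import load_dataset
    from transformers import AutoTokenizer, DataCollatorWithPadding

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    raw = load_dataset("glue", "cola")

    # map() adds input_ids, token_type_ids and attention_mask to every split.
    tokenized = raw.map(
        lambda batch: tokenize_batch(batch, tokenizer), batched=True
    )

    # Pads each batch to its longest sentence at collation time.
    collator = DataCollatorWithPadding(tokenizer=tokenizer)
    return tokenized, tokenizer, collator
```

The tokenized splits would then replace `cola_dataset['train']` and `cola_dataset['validation']` in the `Trainer` call, with `data_collator=collator` passed alongside them.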

Hey, thanks for helping, but even after adding that I still get the same error. I’ve updated the Colab notebook; please take a look.
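
The second error lists only `last_hidden_state` and `pooler_output`, which is what the bare encoder loaded by `AutoModel` returns; it has no classification head, so it cannot compute a loss. A likely fix, sketched below, is to load `AutoModelForSequenceClassification` instead. Separately, `compute_metrics` expects a function that takes the predictions and returns a dict, not a metric object. The sketch computes the Matthews correlation (the score GLUE reports for CoLA) by hand with NumPy to stay self-contained; `load_classifier` and `compute_metrics` are illustrative names, and binary 0/1 labels are assumed.

```python
import numpy as np


def load_classifier(checkpoint="bert-base-uncased", num_labels=2):
    """Load the checkpoint with a sequence-classification head on top."""
    # Local import so compute_metrics below works without transformers.
    from transformers import AutoModelForSequenceClassification

    # Unlike AutoModel, this class returns a loss when labels are passed.
    return AutoModelForSequenceClassification.from_pretrained(
        checkpoint, num_labels=num_labels
    )


def compute_metrics(eval_pred):
    """Turn the (logits, labels) pair from the Trainer into a metrics dict."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    labels = np.asarray(labels)

    # Confusion-matrix counts for the binary case (assumes labels are 0 or 1).
    tp = float(np.sum((preds == 1) & (labels == 1)))
    tn = float(np.sum((preds == 0) & (labels == 0)))
    fp = float(np.sum((preds == 1) & (labels == 0)))
    fn = float(np.sum((preds == 0) & (labels == 1)))

    denom = np.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the score is 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom > 0 else 0.0
    return {"matthews_correlation": float(mcc)}
```

With these, the `Trainer` call would take `model=load_classifier()` and `compute_metrics=compute_metrics`, together with the tokenized datasets.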