Fine-Tuning T5v1.1 Using Trainer API

I just finished Chapter 3 of the HuggingFace course on fine-tuning a pre-trained model. To put my understanding to the test, I’m trying to fine-tune T5v1.1 on the downstream task of sentiment analysis using the IMDB dataset.

I’m using Transformers v4.12.3 and Tokenizers 0.10.3.

Here’s my attempt at fine-tuning with the Trainer API:

from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorWithPadding,
    Trainer,
    TrainingArguments,
)

# Load dataset
dataset = load_dataset("imdb")

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("google/t5-v1_1-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/t5-v1_1-small")

# Tokenize and prepare data
def tokenize_function(example):
    return tokenizer(example["text"], padding=True, truncation=True)

tokenized_dataset = dataset.map(tokenize_function, batched=True)
data_collator = DataCollatorWithPadding(tokenizer)

# Define training arguments
training_args = TrainingArguments("test-trainer")

# Define trainer
trainer = Trainer(
    model,
    training_args,
    train_dataset=tokenized_dataset["train"],
    data_collator=data_collator,
    tokenizer=tokenizer,
)

# Train
trainer.train()

This throws a ValueError: not enough values to unpack (expected 2, got 1) on the second call to the forward method in transformers/models/t5/modeling_t5.py, at line 906: batch_size, seq_length = input_shape.

The first call to forward goes as expected, with input_shape == torch.Size([8, 512]) and the input_ids being the tokenized IMDB reviews. But the second call has input_shape == torch.Size([8]) and input_ids == tensor([[0, 1, 1, 0 ..., which seem to be the sentiment labels.
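
That shape would line up with the kind of collator check from Chapter 3 of the course. If I ran something like the sketch below (just my own debugging snippet, not part of the training script), I’d expect the "labels" entry to come out with shape [8], i.e. the raw 0/1 class ids rather than a sequence of target token ids:

# Debugging sketch (same idea as the collator check in Chapter 3 of the course):
# grab a few tokenized examples, drop the raw text, and inspect what the collator builds.
samples = tokenized_dataset["train"][:8]
samples = {k: v for k, v in samples.items() if k != "text"}
batch = data_collator(samples)
print({k: v.shape for k, v in batch.items()})
# Presumably something like:
# {'input_ids': torch.Size([8, 512]), 'attention_mask': torch.Size([8, 512]), 'labels': torch.Size([8])}
# i.e. "labels" holds one class id per example, not tokenized target text.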

I suspect I’m preparing the dataset wrong, but I’m not sure how to fix it. There is also this post, which covers what is basically an identical error, but I couldn’t apply the fix to my own case.
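
In case it helps, my current guess (which may well be wrong) is that a seq2seq model wants the labels as tokenized target text rather than integer class ids, so the preprocessing would have to look roughly like the sketch below. The label_map verbalization and the switch to DataCollatorForSeq2Seq are just my assumptions; this is only a sketch of my guess, not something I’ve verified.

# My rough guess at the preprocessing a seq2seq model would need:
# verbalize the 0/1 labels and tokenize them as targets.
# label_map and the "negative"/"positive" strings are just my assumption.
from transformers import DataCollatorForSeq2Seq

label_map = {0: "negative", 1: "positive"}

def preprocess_function(examples):
    model_inputs = tokenizer(examples["text"], truncation=True)
    targets = [label_map[label] for label in examples["label"]]
    with tokenizer.as_target_tokenizer():
        labels = tokenizer(targets, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

# Only mapping the train split here, since that's all the Trainer uses above
tokenized_train = dataset["train"].map(
    preprocess_function, batched=True, remove_columns=["text", "label"]
)
data_collator = DataCollatorForSeq2Seq(tokenizer, model=model)
# ...and then pass tokenized_train and this collator to the Trainer instead.

Any help would be greatly appreciated, thanks!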