Trying to Fine-Tune a GPT-2 Story Generator, but Do I Need Labels?

Hey,
I’m trying to fine-tune GPT-2 (small) on a custom dataset.
The idea is that I pass in a list of objects and GPT-2 should create a story out of them.

Example:
prompt:
house, cat, table
completion:
Once upon a time, there was a cozy house on the edge of a small town. In that house lived a mischievous cat who loved to jump onto the table and knock things over.

My Code Snippet:

    for i in range(epochs):
        for X, y, a in chatData:  # X: prompt token IDs, y: completion token IDs, a: attention mask
            optim.zero_grad()
            loss = model(X, attention_mask=a, labels=y).loss  # this is where I struggle
            loss.backward()
            optim.step()

Variables:
X is the prompt, e.g. “[BOS] house, cat, table [STORY START]”
y is the completion, e.g. “the story about the objects. [EOS]”

Of course, I tokenized X and y and also applied truncation and padding (where necessary).
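
For reference, this is roughly how I prepare X, y and a (a simplified sketch; the tokenizer setup, special-token names, and max lengths are just what I use and may not be what you'd pick):

    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    # Rough sketch of my data prep (simplified).
    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    tokenizer.add_special_tokens(
        {"bos_token": "[BOS]", "eos_token": "[EOS]",
         "pad_token": "[PAD]", "additional_special_tokens": ["[STORY START]"]}
    )

    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.resize_token_embeddings(len(tokenizer))  # account for the added special tokens

    prompt = "[BOS] house, cat, table [STORY START]"
    completion = "the story about the objects. [EOS]"

    enc_prompt = tokenizer(prompt, truncation=True, padding="max_length",
                           max_length=32, return_tensors="pt")
    enc_completion = tokenizer(completion, truncation=True, padding="max_length",
                               max_length=128, return_tensors="pt")

    X = enc_prompt["input_ids"]        # prompt token IDs
    a = enc_prompt["attention_mask"]   # attention mask for the prompt
    y = enc_completion["input_ids"]    # completion token IDs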

But is this the right approach? The more I search online, the more I find people setting the labels equal to the input. They would probably do something like this (but I’m not sure):

    X = "[BOS] house, cat, table [STORY START] the story about the objects. [EOS]"

    loss = model(X, attention_mask=a, labels=X).loss
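
Written out in full, I think that approach would look something like this (a minimal sketch of my understanding, reusing the tokenizer and model from above; setting padded positions to -100 is something I’ve seen others do, not sure if it’s required):

    # Sketch of the "labels = input_ids" variant as I understand it (may be wrong).
    text = ("[BOS] house, cat, table [STORY START] "
            "Once upon a time, there was a cozy house... [EOS]")

    enc = tokenizer(text, truncation=True, padding="max_length",
                    max_length=160, return_tensors="pt")

    input_ids = enc["input_ids"]
    attention_mask = enc["attention_mask"]

    # labels are just a copy of the inputs; the model shifts them internally
    # for next-token prediction
    labels = input_ids.clone()
    # padding positions are commonly set to -100 so they don't contribute to the loss
    labels[attention_mask == 0] = -100

    loss = model(input_ids, attention_mask=attention_mask, labels=labels).loss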

But is this correct for my specific use case as well?
I’m honestly a little bit confused right now :D
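
For completeness, this is roughly how I plan to use the model after training, i.e. feed the objects as the prompt and let GPT-2 generate the story (a minimal sketch; the sampling settings are just placeholders):

    # Sketch of generating a story from a list of objects after fine-tuning.
    prompt = "[BOS] house, cat, table [STORY START]"
    inputs = tokenizer(prompt, return_tensors="pt")

    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=200,                     # placeholder length
        do_sample=True,                         # sampling settings are just an example
        top_p=0.9,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))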