Trying to Fine-Tune a GPT-2 Story Generator, but Do I Need Labels?

Hey,
I’m trying to fine-tune GPT-2 (small) on a custom dataset.
The idea is that I pass in a list of objects and GPT-2 should create a story out of them.

Example:
prompt:
house, cat, table
completion:
Once upon a time, there was a cozy house on the edge of a small town. In that house lived a mischievous cat who loved to jump onto the table and knock things over.

My Code Snippet:

    for i in range(epochs):
        for X, y, a in chatData:  # X: prompt token IDs, y: completion token IDs, a: attention mask
            optim.zero_grad()
            loss = model(X, attention_mask=a, labels=y).loss  # this is where I struggle
            loss.backward()
            optim.step()

Variables:
X is the prompt, e.g. “[BOS] house, cat, table [STORY START]”
y is the completion, e.g. “the story about the objects. [EOS]”

Of course, I tokenized X and y and also applied truncation and padding (where necessary).
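
For reference, this is roughly how I prepare X, y and a (a simplified sketch; the tokenizer setup, special-token names, and max lengths are just what I use and may not be what you'd pick):

    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    # Rough sketch of my data prep (simplified).
    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    tokenizer.add_special_tokens(
        {"bos_token": "[BOS]", "eos_token": "[EOS]",
         "pad_token": "[PAD]", "additional_special_tokens": ["[STORY START]"]}
    )

    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.resize_token_embeddings(len(tokenizer))  # account for the added special tokens

    prompt = "[BOS] house, cat, table [STORY START]"
    completion = "the story about the objects. [EOS]"

    enc_prompt = tokenizer(prompt, truncation=True, padding="max_length",
                           max_length=32, return_tensors="pt")
    enc_completion = tokenizer(completion, truncation=True, padding="max_length",
                               max_length=128, return_tensors="pt")

    X = enc_prompt["input_ids"]        # prompt token IDs
    a = enc_prompt["attention_mask"]   # attention mask for the prompt
    y = enc_completion["input_ids"]    # completion token IDs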

But is this the right approach? The more I search online, the more I find people setting the labels equal to the input. They would probably do something like this (but I’m not sure):

    X = "[BOS] house, cat, table [STORY START] the story about the objects. [EOS]"

    loss = model(X, attention_mask=a, labels=X).loss
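
Written out in full, I think that approach would look something like this (a minimal sketch of my understanding, reusing the tokenizer and model from above; setting padded positions to -100 is something I’ve seen others do, not sure if it’s required):

    # Sketch of the "labels = input_ids" variant as I understand it (may be wrong).
    text = ("[BOS] house, cat, table [STORY START] "
            "Once upon a time, there was a cozy house... [EOS]")

    enc = tokenizer(text, truncation=True, padding="max_length",
                    max_length=160, return_tensors="pt")

    input_ids = enc["input_ids"]
    attention_mask = enc["attention_mask"]

    # labels are just a copy of the inputs; the model shifts them internally
    # for next-token prediction
    labels = input_ids.clone()
    # padding positions are commonly set to -100 so they don't contribute to the loss
    labels[attention_mask == 0] = -100

    loss = model(input_ids, attention_mask=attention_mask, labels=labels).loss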

But is this correct for my specific use case as well?
I’m honestly a little bit confused right now :D
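
For completeness, this is roughly how I plan to use the model after training, i.e. feed the objects as the prompt and let GPT-2 generate the story (a minimal sketch; the sampling settings are just placeholders):

    # Sketch of generating a story from a list of objects after fine-tuning.
    prompt = "[BOS] house, cat, table [STORY START]"
    inputs = tokenizer(prompt, return_tensors="pt")

    output_ids = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=200,                     # placeholder length
        do_sample=True,                         # sampling settings are just an example
        top_p=0.9,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))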