Generate desired text output based on model training

I’ve just started learning AI/ML. Let me explain what I’m trying to achieve here.

To keep it simple initially:
I need to train a model on a set of ‘input text’ => ‘expected output text’ pairs.
After training, if I query the model with the same input text I trained it on, I expect the model to generate the corresponding expected output.

I’m loading my ‘input text’ / ‘expected output’ pairs from a JSON file.
Then I feed these pairs to the model for training.

The training dataset has ‘input_ids’ containing the tokens of the input text.
It also has ‘labels’ containing the tokens of the expected output text.
You can assume the ‘attention_mask’ in the training dataset is also correct.

Training with Trainer and TrainingArguments runs successfully for 3-5 epochs.
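Roughly, my training setup looks like this (a simplified sketch; the JSON file name, max length, and training arguments below are placeholders, not my exact values):

```python
import json
from transformers import GPT2Tokenizer, GPT2LMHeadModel, Trainer, TrainingArguments

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token        # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained("gpt2")

# "pairs.json" is a placeholder name; it holds [{"input": ..., "output": ...}, ...]
with open("pairs.json") as f:
    pairs = json.load(f)

max_len = 64
train_dataset = []
for p in pairs:
    enc = tokenizer(p["input"], padding="max_length", truncation=True,
                    max_length=max_len, return_tensors="pt")
    lab = tokenizer(p["output"], padding="max_length", truncation=True,
                    max_length=max_len, return_tensors="pt")
    train_dataset.append({
        "input_ids": enc["input_ids"][0],
        "attention_mask": enc["attention_mask"][0],
        "labels": lab["input_ids"][0],           # tokens of the expected output text
    })

args = TrainingArguments(output_dir="gpt2-out", num_train_epochs=3,
                         per_device_train_batch_size=2)
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()
```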

After training,
I take one of the trained input texts and pass it to the same model using the same tokenizer.
The token tensors generated for the input text match the input_ids from the training dataset exactly.
I then call model.generate() with what I believe are the appropriate parameters, expecting to get the training label text back as output.

But instead it generates some other, seemingly random text.

I have decided to go with the ‘gpt2’ model for this.
I’m using GPT2Tokenizer as the tokenizer, along with GPT2LMHeadModel as the model.
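The inference step looks roughly like this (again a sketch; the generation parameters are placeholders, not my exact values):

```python
prompt = "I am having Fever, headache and fatigue."
inputs = tokenizer(prompt, return_tensors="pt")

output_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    max_new_tokens=40,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```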

Example input text and expected output label:
“I am having Fever, headache and fatigue.” => “You maybe suffering from Influenza.”

But when I give an input after training like “I am having Fever, headache and fatigue.”,
it generates some random text like

"I have had this problem for a long time. I have never had a fever or headache, but this is the first time in my life that I’ve had it. It has been a while since I had any symptoms, and I’m not sure if it’s because of the cold, or if I just have a cold or something else.

The first thing I noticed was that "

Am I using the correct approach here?
Do I need to change my model?
Or are there any specific training parameters I need to use?

Could someone kindly share your comments on this? Thanks.


Hey, trying to use GPT-2 to map specific input texts to specific outputs might not work out as you expect. It is mainly designed to generate coherent text by predicting the next word in a sequence based on the previous words. It’s awesome for general text generation, but it’s not really built for tasks where you need a direct mapping from one piece of text to another, especially when the output isn’t just a continuation of the input.

For what you’re trying to do, IMO a sequence-to-sequence model would be a better fit. These models are made for tasks where the output is a transformation of the input, like translation, summarization, or any other kind of text-to-text mapping.
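For example, here is a rough sketch of fine-tuning t5-small on the same kind of JSON pairs with Seq2SeqTrainer (the file name, sequence length, and hyperparameters are only placeholders to show the shape of the code, not tuned values):

```python
import json
from transformers import (T5Tokenizer, T5ForConditionalGeneration,
                          Seq2SeqTrainer, Seq2SeqTrainingArguments)

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

with open("pairs.json") as f:                    # same JSON of input/output pairs
    pairs = json.load(f)

max_len = 64

def encode(pair):
    enc = tokenizer(pair["input"], padding="max_length", truncation=True, max_length=max_len)
    lab = tokenizer(pair["output"], padding="max_length", truncation=True, max_length=max_len)
    # Replace padding in the labels with -100 so the loss ignores it
    labels = [t if t != tokenizer.pad_token_id else -100 for t in lab["input_ids"]]
    return {"input_ids": enc["input_ids"],
            "attention_mask": enc["attention_mask"],
            "labels": labels}

train_dataset = [encode(p) for p in pairs]

args = Seq2SeqTrainingArguments(output_dir="t5-out", num_train_epochs=20,
                                per_device_train_batch_size=4, learning_rate=3e-4)
trainer = Seq2SeqTrainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()

# After training, generate() produces the mapped output instead of a continuation
ids = tokenizer("I am having Fever, headache and fatigue.", return_tensors="pt").input_ids
print(tokenizer.decode(model.generate(ids, max_new_tokens=32)[0], skip_special_tokens=True))
```

With an encoder-decoder model like T5, the labels are decoded as a separate target sequence rather than as a continuation of the prompt, which matches the input-to-output mapping you described.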


Thank you so much @thesab. I can try working with some of the sequence-to-sequence models.

But there is a minor change in my requirement, and I just wanted to know if such a model can still be used for it.

Instead of mapping one text to another text during training, I would need to map a set of keywords to a text.
I can then train the model with different combinations of the same set of keywords mapping to the same text.

Eg. Keywords(A, B, C) => Sentence 1
Keywords(B, A, C) => Sentence 1
Keywords(C, A, B) => Sentence 1
Keywords(D, A) => Sentence 2
Keywords(A, D) => Sentence 2
and so on.
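Concretely, I imagine serializing each keyword set into a plain input string so the model sees ordinary text on both sides, something like this (hypothetical formatting, just to illustrate):

```python
# Hypothetical formatting: each keyword set becomes one plain input string
examples = [
    {"input": "keywords: A, B, C", "output": "Sentence 1"},
    {"input": "keywords: B, A, C", "output": "Sentence 1"},
    {"input": "keywords: C, A, B", "output": "Sentence 1"},
    {"input": "keywords: D, A",    "output": "Sentence 2"},
    {"input": "keywords: A, D",    "output": "Sentence 2"},
]
```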

So will a sequence-to-sequence model be useful for this use case as well?
Is this achievable?

Also, will you be able to confirm the actual model, tokenizer, Trainer, and TrainingArguments that I can use to fulfill this use case?

Thanks again for sharing your inputs. :slight_smile:
