Hi all,
I would like to know which pretrained model I could start from to train a model that generates lists of numbers from a natural-language prompt.
Examples:
“Create a list of 3 prime numbers” → 2, 3, 5
“Create a list of 5 random numbers” → 5, 7, 19, 22, 24
“Create a list of 2 negative numbers” → -3, -5
etc.
From what I’ve read, the recommended models would be those tagged “text-generation”, but before proceeding I would like some confirmation, and to understand whether e.g. a GPT-2 model would be fine or whether there are other, even lighter, options.
Based on the examples you gave, where there is a clear separation between the input and the output, you can also consider “encoder-decoder” language models, like T5. These models have a text2text-generation tag.
One more note: for tasks where you want the model to follow an instruction, consider starting from an instruction-tuned model like Flan-T5 or BLOOMZ. These models have been fine-tuned to follow instructions and tend to behave better on this kind of task.
Hi Joao,
I was able to load and query the “Flan-T5” model both locally and using the API.
I was also able to write code that queries the model locally using inputs from a CSV file structured like this:
input_text;target_text
“Write a list of numbers”;“1, 2, 3”
“Write 5 numbers”;“3, 6, 8, 2, 1”
etc.
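A file laid out like this can probably be read with Python's standard `csv` module by setting the delimiter to `;`. The sketch below is only an illustration: the helper name `read_pairs` is hypothetical, and stripping typographic quotes is an assumption for the case where the file really contains “ ” rather than plain double quotes.

```python
import csv
import io

# Quote characters stripped from field edges (assumption: the file may
# contain typographic quotes instead of plain double quotes).
QUOTES = "\u201c\u201d\""


def read_pairs(handle):
    """Return (input_text, target_text) tuples from a semicolon-delimited CSV."""
    reader = csv.DictReader(handle, delimiter=";")
    pairs = []
    for row in reader:
        source = row["input_text"].strip().strip(QUOTES)
        target = row["target_text"].strip().strip(QUOTES)
        pairs.append((source, target))
    return pairs


# In-memory stand-in for the real file, using the rows shown above.
sample = (
    "input_text;target_text\n"
    "\u201cWrite a list of numbers\u201d;\u201c1, 2, 3\u201d\n"
    "\u201cWrite 5 numbers\u201d;\u201c3, 6, 8, 2, 1\u201d\n"
)
pairs = read_pairs(io.StringIO(sample))
print(pairs)
```

For a file on disk, the same helper could be called on `open(path, newline="", encoding="utf-8")`. Because the delimiter is `;`, the commas inside the target lists do not split the field.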
But I wasn’t able to find any code sample for training the model. Is there one?
I am using Python on Windows 10.
Now my question is, how should I change this instruction to read the CSV file and use the FLAN-T5-SMALL model?
Of course I will change the model name to flan-t5-small, but how do I load the CSV, and is a source prefix needed?
Thank you very much for your help!
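For reference, one possible way to fine-tune on such a CSV with the `datasets` and `transformers` libraries is sketched below. It is not the official example script, just a minimal outline under several assumptions: the checkpoint name `google/flan-t5-small`, the output directory name, the helper names `add_prefix` and `fine_tune`, and all hyperparameters are illustrative placeholders, and the CSV is assumed to use plain double quotes. On the prefix question: the original T5 checkpoints were trained with task prefixes such as "summarize: ", whereas with Flan-T5 the instruction in `input_text` likely already plays that role, so the prefix defaults to an empty string here.

```python
def add_prefix(texts, prefix=""):
    """Prepend an optional task prefix to every input string."""
    return [prefix + text for text in texts]


def fine_tune(csv_path, model_name="google/flan-t5-small",
              output_dir="flan-t5-small-numbers", prefix=""):
    """Fine-tune a seq2seq model on input_text/target_text pairs (sketch)."""
    # Heavy dependencies are imported here so that add_prefix stays usable
    # even when they are not installed.
    from datasets import load_dataset
    from transformers import (
        AutoModelForSeq2SeqLM,
        AutoTokenizer,
        DataCollatorForSeq2Seq,
        Seq2SeqTrainer,
        Seq2SeqTrainingArguments,
    )

    # The "csv" loader accepts a delimiter for semicolon-separated files.
    dataset = load_dataset("csv", data_files=csv_path, delimiter=";")["train"]
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    def preprocess(batch):
        # Tokenize the (optionally prefixed) inputs and the targets.
        inputs = add_prefix(batch["input_text"], prefix)
        model_inputs = tokenizer(inputs, max_length=64, truncation=True)
        labels = tokenizer(text_target=batch["target_text"],
                           max_length=32, truncation=True)
        model_inputs["labels"] = labels["input_ids"]
        return model_inputs

    tokenized = dataset.map(preprocess, batched=True,
                            remove_columns=dataset.column_names)

    args = Seq2SeqTrainingArguments(
        output_dir=output_dir,
        per_device_train_batch_size=8,  # placeholder values, tune as needed
        num_train_epochs=3,
        learning_rate=3e-4,
    )
    trainer = Seq2SeqTrainer(
        model=model,
        args=args,
        train_dataset=tokenized,
        # Pads inputs and labels per batch.
        data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    )
    trainer.train()
    trainer.save_model(output_dir)
    tokenizer.save_pretrained(output_dir)
```

Calling `fine_tune("train.csv")` (a hypothetical file name) would start training and write the result to the output directory. With only a handful of rows the model will mostly memorize them, so a larger and more varied CSV is probably needed for useful results.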