Using GPT-J models for many NLP tasks

sridharnlp · November 21, 2022, 7:24am

Hi everyone, as most of you are aware, the GPT models from EleutherAI, be it GPT-Neo or GPT-J, are powerful and come pretty close to the original GPT model by OpenAI.

The text generation capability of the model is really good, and since the model was trained on a large dataset, it can be adapted to other tasks fairly easily.

My question is: what is the best approach for other NLP tasks? Using few-shot prompting to get the desired output, or properly fine-tuning the model on our own custom dataset for the desired task?

Also, if we increase the number of examples in the few-shot approach, will the response time increase as well? I ask because the results are sometimes not accurate, and I think more examples are required.
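For illustration, here is a minimal sketch of what adding few-shot examples does to the prompt. The helper names, the prompt template, and the intent examples are all made up for this sketch, and word count is only a rough stand-in for the token count the model actually sees.

```python
# Hypothetical helper: builds a few-shot prompt for an intent-detection task.
# The "Text:/Intent:" template is an assumption, not a required format.
def build_few_shot_prompt(examples, query):
    """Join (text, label) demonstration pairs and the new query into one prompt."""
    blocks = [f"Text: {text}\nIntent: {label}" for text, label in examples]
    # The final block is left open so the model completes the label.
    blocks.append(f"Text: {query}\nIntent:")
    return "\n\n".join(blocks)


# Made-up demonstration examples, for illustration only.
EXAMPLES = [
    ("Book me a flight to Delhi", "book_flight"),
    ("What is the weather like today?", "get_weather"),
    ("Play some jazz music", "play_music"),
    ("Set an alarm for 6 am", "set_alarm"),
]


def prompt_lengths(examples, query):
    """Prompt length in whitespace-separated words for 0..len(examples) shots."""
    return [
        len(build_few_shot_prompt(examples[:k], query).split())
        for k in range(len(examples) + 1)
    ]


if __name__ == "__main__":
    query = "Remind me to call mom"
    for k, n_words in enumerate(prompt_lengths(EXAMPLES, query)):
        print(f"{k} shots -> {n_words} words in prompt")
```

Each extra example makes the prompt longer, and the model has to process the whole prompt on every request, so more shots generally means more compute per call. How much that shows up as latency depends on the hardware and the serving setup.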

As the model is large, I think fine-tuning it for the desired tasks takes more hardware, since each fine-tuned model will handle only one task. If I want several NLP tasks such as summarization, entity recognition, intent detection, etc., I will need 4-5 separate models, and if one occupies 40GB of RAM, then 4-5 will take 4 to 5 times that space and require specialised hardware.
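The arithmetic behind that concern can be sketched as follows. This is a back-of-the-envelope calculation only: the 40GB per-model figure is the estimate from this post, not a measurement, and the function name is made up for the example.

```python
# Rough memory arithmetic; all figures are assumptions taken from the post.
PER_MODEL_GB = 40  # estimated RAM for one loaded model (not measured)


def total_memory_gb(per_model_gb, n_models):
    """Total RAM if every model is kept loaded at the same time."""
    return per_model_gb * n_models


if __name__ == "__main__":
    # One fine-tuned model per task, for 4 and 5 tasks.
    for n_tasks in (4, 5):
        print(f"{n_tasks} fine-tuned models: {total_memory_gb(PER_MODEL_GB, n_tasks)} GB")
    # One shared model used with a different few-shot prompt per task.
    print(f"1 shared few-shot model: {total_memory_gb(PER_MODEL_GB, 1)} GB")
```

Under these assumptions, separate fine-tuned models need 160 to 200 GB if all are loaded at once, while a single shared model used with few-shot prompts stays at 40 GB regardless of the number of tasks.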

If anyone knows or has worked with this model, please share your experience and suggestions!
