Hi everyone. As most of you know, the GPT models from EleutherAI, GPT-Neo and GPT-J, come pretty close to the original GPT model by OpenAI.
The text generation capability of these models is really good, and they are trained on a large dataset, so they can easily be adapted to other tasks.
My question is: what works best for other NLP tasks? Is it few-shot learning, where a handful of examples are placed in the prompt to get the desired output, or proper training, where the model is fine-tuned on our own custom dataset for the desired task?
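To make the few-shot option concrete, here is a minimal sketch of how I picture building such a prompt. The function name, the intent task, and the example messages are all made up for illustration; the resulting string would be passed to the model as-is.

```python
# Hypothetical few-shot prompt builder; task, labels, and examples are invented.
def build_few_shot_prompt(examples, query,
                          instruction="Classify the intent of each message."):
    """Join an instruction, labeled examples, and the new query into one prompt."""
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Message: {text}")
        lines.append(f"Intent: {label}")
        lines.append("")
    # Leave the last label empty so the model is asked to complete it.
    lines.append(f"Message: {query}")
    lines.append("Intent:")
    return "\n".join(lines)


examples = [
    ("I want to book a flight to Delhi", "book_flight"),
    ("What is the weather tomorrow?", "get_weather"),
]
prompt = build_few_shot_prompt(examples, "Reserve a table for two tonight")
print(prompt)
```

With fine-tuning, by contrast, the examples would go into a training dataset instead of the prompt.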
Also, if we increase the number of examples in the few-shot approach, will the response time increase as well? I ask because the results are sometimes not accurate, and I think more examples are required.
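The reason I suspect a slowdown is that every extra example makes the prompt longer, and the whole prompt is sent with each request. A rough sketch of that growth, using a whitespace word count as a crude stand-in for the real token count (the example text and the `build_prompt` helper are hypothetical):

```python
def build_prompt(examples, query):
    """Concatenate labeled examples followed by the unlabeled query."""
    parts = [f"Text: {text}\nLabel: {label}\n" for text, label in examples]
    parts.append(f"Text: {query}\nLabel:")
    return "\n".join(parts)


example = ("The delivery was late and the box was damaged", "negative")
query = "Great service, will order again"

sizes = {}
for n in (1, 4, 16):
    prompt = build_prompt([example] * n, query)
    # Word count is only an approximation; a real tokenizer would give other numbers.
    sizes[n] = len(prompt.split())
    print(f"{n} example(s): about {sizes[n]} words in the prompt")
```

The prompt length grows linearly with the number of examples, so I assume the time the model spends reading the prompt grows too.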
Since the model is large, I think fine-tuning it for a desired task takes more hardware, because each fine-tuned model handles only one task. If I want several NLP tasks such as summarization, entity recognition, intent detection, etc., I will need 4-5 separate models. If one model occupies 40 GB of RAM, then 4-5 of them will take 4 to 5 times that space and will require specialised hardware.
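To spell out the arithmetic behind that estimate (the 40 GB per model is my own figure from above, not a measured number, and the helper name is just for illustration):

```python
# Assumed figure from this post: roughly 40 GB of RAM per fine-tuned model.
GB_PER_MODEL = 40


def total_ram_gb(num_models, gb_per_model=GB_PER_MODEL):
    """RAM needed to keep one separate fine-tuned model loaded per task."""
    return num_models * gb_per_model


for n in (1, 4, 5):
    print(f"{n} model(s): {total_ram_gb(n)} GB")
```

So 4-5 separate models would need somewhere between 160 GB and 200 GB if they are all loaded at once.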
If anyone knows about these models or has worked with them, please share your experience and suggestions!