Let's say we got the GPT-3 model weights from OpenAI (I know GPT-3 is closed source).
Then we could fine-tune the GPT-3 model.
In that case, what would be the difference between fine-tuning and few-shot learning?
The case for few-shot learning is that the model does not need to be trained at all; you just show it a few examples in the prompt, as in the sketch below.
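To make sure I'm using the term correctly, here is roughly what I mean by few-shot learning (the task and examples are made up for illustration; the `generate` call at the end is hypothetical):

```python
# Few-shot prompting: no gradient updates. The "learning" happens
# entirely inside the prompt, which contains a few worked examples.
few_shot_prompt = """Translate English to French.

English: cheese
French: fromage

English: bicycle
French: vélo

English: library
French:"""

# The prompt is fed to the frozen model as-is; we only sample a completion.
# completion = model.generate(few_shot_prompt)  # hypothetical call
```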
But fine-tuning seems easy as well: we just continue training on a smaller, task-specific dataset. Also, from what I understand, fine-tuning doesn't need a complex GPU setup either.
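Here is roughly what I imagine the fine-tuning workflow would look like, sketched with Hugging Face `transformers` and GPT-2 as a stand-in (since GPT-3 weights aren't actually available; the dataset file and hyperparameters are placeholders):

```python
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Placeholder dataset: any plain-text file, one example per line.
dataset = load_dataset("text", data_files={"train": "my_corpus.txt"})
tokenized = dataset["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-finetuned",
                           per_device_train_batch_size=2,
                           num_train_epochs=1),
    train_dataset=tokenized,
    # mlm=False gives a causal-LM collator: labels are the input ids,
    # so the model is trained to predict the next token.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```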
So if someone had access to GPUs, why would they ever use few-shot learning?
And if someone did get the GPT-3 model weights, would they be able to fine-tune it with a couple of RTX 3080 GPUs, or would it need a setup like the big companies have?
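For reference, this is the rough back-of-envelope math I'm trying to sanity-check (assuming fp16 weights at 2 bytes per parameter; gradients, optimizer state, and activations would add several times more on top):

```python
# Back-of-envelope: can GPT-3's weights even fit on consumer GPUs?
n_params = 175e9      # GPT-3 175B
bytes_fp16 = 2        # bytes per parameter in fp16

weights_gb = n_params * bytes_fp16 / 1e9
print(f"fp16 weights alone: ~{weights_gb:.0f} GB")        # ~350 GB

rtx3080_vram_gb = 10  # original RTX 3080 VRAM
print(f"2 x RTX 3080: {2 * rtx3080_vram_gb} GB of VRAM")  # 20 GB
```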