Fine-tuning GPT-J 6B on a custom dataset

How do I prepare a dataset to feed to GPT-J 6B for fine-tuning?
Any steps or a tutorial would be appreciated. Thanks!

Hi, @Syed313! Thanks for the question. :slight_smile:

@deniskamazur modified EleutherAI’s GPT-J 6B model so you can generate with it and fine-tune it in Colab or on an equivalent desktop GPU (e.g. a single 1080 Ti).

:notebook_with_decorative_cover: The proof-of-concept notebook is available here.
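
Since the original question was about dataset preparation, here is a minimal sketch of how you could tokenize a plain-text corpus for causal-LM fine-tuning with the `datasets` and `transformers` libraries. The file name `train.txt` and the `max_length` of 512 are placeholders, and this is not taken from the notebook, whose own preprocessing may differ:

```python
# Hypothetical dataset-preparation sketch (not the notebook's exact code).
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
tokenizer.pad_token = tokenizer.eos_token  # GPT-J ships without a pad token

raw = load_dataset("text", data_files={"train": "train.txt"})  # placeholder file

def tokenize(batch):
    # Shorter contexts keep activation memory manageable on a single GPU.
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = raw.map(tokenize, batched=True, remove_columns=["text"])
```

From there, `DataCollatorForLanguageModeling(tokenizer, mlm=False)` will batch the examples and copy `input_ids` into `labels` for the causal-LM loss.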

As you are probably already aware, the original GPT-J takes 22+ GB of memory just for its float32 parameters, and even if you cast everything to 16-bit it still won’t fit onto most single-GPU setups short of an A6000 or A100. You can run inference on TPUs or CPUs, but fine-tuning is a bit more expensive. This implementation should be a bit more cost-effective.
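
For intuition, the 22+ GB figure is just the parameter count times the bytes per value. A quick back-of-the-envelope sketch (parameter count rounded to the advertised 6B, gradients and optimizer state not included):

```python
# Rough parameter-memory estimate for GPT-J 6B at different precisions.
n_params = 6e9  # roughly the advertised parameter count

for name, bytes_per_param in [("float32", 4), ("float16", 2), ("int8", 1)]:
    print(f"{name}: {n_params * bytes_per_param / 2**30:.1f} GiB")
# float32: ~22.4 GiB, float16: ~11.2 GiB, int8: ~5.6 GiB
```

Fine-tuning additionally needs gradients and optimizer state on top of the weights, which is why it is more demanding than inference.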
