How do I prepare a dataset to feed GPT-J 6B for fine-tuning?
Any steps or a tutorial would be appreciated. Thanks!
Hi, @Syed313! Thanks for the question.
@deniskamazur modified EleutherAI's GPT-J 6B model so that you can run generation and fine-tuning in Colab or on an equivalent desktop GPU (e.g. a single 1080 Ti).
The proof of concept notebook is available here.
As you are probably already aware, the original GPT-J takes 22+ GB of memory for its float32 parameters alone; even if you cast everything to 16-bit, it still will not fit on most single-GPU setups short of an A6000 or A100. You can run inference on TPUs or CPUs, but fine-tuning there is considerably more expensive. This implementation should be more cost-effective.
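To make the memory numbers above concrete, here is a quick back-of-the-envelope calculation (assuming GPT-J's roughly 6.05 billion parameters; this counts weights only, not activations or optimizer state):

```python
# Rough weight-only memory footprint of GPT-J (~6.05B parameters assumed).
# Optimizer state and activations during fine-tuning add substantially more.
params = 6_050_000_000

fp32_gb = params * 4 / 1024**3  # 4 bytes per float32 parameter
fp16_gb = params * 2 / 1024**3  # 2 bytes per 16-bit parameter

print(f"float32: {fp32_gb:.1f} GB")  # matches the 22+ GB figure above
print(f"16-bit:  {fp16_gb:.1f} GB")  # still beyond most consumer GPUs
```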
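On the dataset-preparation part of the original question: the usual recipe for causal-LM fine-tuning is to tokenize your raw text, concatenate all token ids, and cut them into fixed-length blocks where the labels equal the input ids. A minimal sketch of that pattern follows; the `tokenize` function here is a character-level stand-in for illustration only, and with the real model you would use GPT-J's own tokenizer from the `transformers` library instead:

```python
def tokenize(text):
    # Stand-in tokenizer for illustration: one id per character.
    # With the real model, replace this with the GPT-J tokenizer.
    return [ord(c) for c in text]

def group_into_blocks(texts, block_size=8):
    """Concatenate token ids from all texts, then cut into fixed-size blocks."""
    ids = []
    for t in texts:
        ids.extend(tokenize(t))
    usable = (len(ids) // block_size) * block_size  # drop the tail remainder
    return [
        {"input_ids": ids[i:i + block_size], "labels": ids[i:i + block_size]}
        for i in range(0, usable, block_size)
    ]

# 27 characters total -> 3 full blocks of 8 ids, remainder dropped
examples = group_into_blocks(["hello world", "fine-tuning data"], block_size=8)
print(len(examples))  # → 3
```

With real data you would apply the same grouping via `datasets.Dataset.map` in batched mode, then hand the result to your training loop or `Trainer`.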