Using GPT-J for custom sequence classification

Hello, I’d like to use GPT-J to classify ~200-character text strings. I have a labeled dataset with ~1,000 examples of each class. I have experience with TensorFlow and PyTorch but am new to Hugging Face. I assume I will need to use transformers.GPTJForSequenceClassification for this task. I have a few beginner questions:
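For reference, this is roughly how I'm planning to instantiate the model. I haven't tested the full-size load; the `num_labels` value and the pad-token handling are my assumptions from reading the docs. To sanity-check the API shape without downloading the ~24 GB checkpoint, a randomly initialized toy-sized GPT-J seems to work the same way:

```python
from transformers import GPTJConfig, GPTJForSequenceClassification

# For the real model I'd use (untested on my end):
#   model = GPTJForSequenceClassification.from_pretrained(
#       "EleutherAI/gpt-j-6B", num_labels=4)
#
# Toy-sized stand-in with the same class, just to check the API:
config = GPTJConfig(
    vocab_size=100, n_positions=64, n_embd=32,
    n_layer=1, n_head=2, rotary_dim=16,
    num_labels=4,    # number of classes in my dataset (assumed 4 here)
    pad_token_id=0,  # GPT-J defines no pad token; the classification head
)                    # pools the last non-pad token, so one must be set
model = GPTJForSequenceClassification(config)

# The classification head is a bias-free linear layer named `score`:
print(model.score)  # Linear(in_features=32, out_features=4, bias=False)
```

Is this the right general shape, or am I missing a required step?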

In my (perhaps naive) understanding, I would need to finetune this model to fit my classification task. Finetuning shouldn’t need to compute gradients through the entire model, just through the classification head, so I should be able to train this on my 16GB GPU, no? (I suspect I’m wrong based on what I’ve read, but why?)
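For context, here is my rough back-of-the-envelope math on why I'm unsure, assuming the published ~6B parameter count. Even with every layer frozen, the weights themselves have to sit in GPU memory for the forward pass:

```python
# Rough memory math for GPT-J (assumption: ~6.05e9 parameters, the published size).
n_params = 6.05e9

bytes_fp32 = n_params * 4  # weights alone in float32
bytes_fp16 = n_params * 2  # weights alone in float16

print(f"fp32 weights: {bytes_fp32 / 1e9:.1f} GB")  # ~24.2 GB
print(f"fp16 weights: {bytes_fp16 / 1e9:.1f} GB")  # ~12.1 GB
```

So in fp32 the frozen weights alone already exceed 16GB, and in fp16 they fit only with a few GB to spare for activations. Is that the right way to think about it?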

Does the model keep track of which layers are frozen during finetuning, or do I need to configure this?
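To make that question concrete: my understanding (possibly wrong) is that neither PyTorch nor transformers tracks "frozen" layers as a separate concept, and that setting `requires_grad = False` on the body's parameters is the whole mechanism. Sketched with a stand-in model rather than GPT-J itself:

```python
import torch.nn as nn

# Stand-in for GPTJForSequenceClassification: a "transformer body" to freeze
# plus a trainable classification head (the real model names its head `score`).
model = nn.Sequential(
    nn.Linear(16, 16),  # pretend transformer body
    nn.Linear(16, 3),   # pretend classification head, 3 classes
)

# Freeze everything, then unfreeze just the head.
for p in model.parameters():
    p.requires_grad = False
for p in model[1].parameters():
    p.requires_grad = True

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)  # ['1.weight', '1.bias']
```

Is that all there is to it, or does the Trainer / optimizer need to be told as well?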

Is there a good resource / tutorial for finetuning GPT-J specifically for sequence classification?