Hello! I’m trying to figure out how to fine-tune bert-base-uncased to perform zero-shot classification. I managed to get it working but I’m not sure I’m preparing the data in the right way:
I initialize my model with `problem_type="multi_label_classification"` so that it uses a sigmoid-based loss (`BCEWithLogitsLoss`) instead of softmax cross-entropy.
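Concretely, this is roughly how I create the model (`num_labels=3` is just a placeholder for the size of my label set):

```python
from transformers import AutoModelForSequenceClassification

# num_labels is hypothetical here; I set it to the size of my label set.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    problem_type="multi_label_classification",  # selects BCEWithLogitsLoss
    num_labels=3,
)
```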
Then I prepare my data in the following way (a code sketch follows this list):
- I tokenize the input string and the label together with `tokenizer(sentence, label, truncation=True)` and save the result in the `input_ids` field of my dataset.
- I also convert the label to a one-hot vector and keep it in the `labels` field of my dataset.
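My preprocessing looks roughly like this (the column names and the label set are placeholders for my actual data):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
label_list = ["sports", "politics", "tech"]  # hypothetical label set

def preprocess(example):
    # Tokenize the sentence and its label as a sentence pair:
    # [CLS] sentence [SEP] label [SEP]
    enc = tokenizer(example["sentence"], example["label"], truncation=True)
    # One-hot encode the label as floats (BCEWithLogitsLoss expects floats)
    one_hot = [0.0] * len(label_list)
    one_hot[label_list.index(example["label"])] = 1.0
    enc["labels"] = one_hot
    return enc
```

I then apply this to my dataset with `dataset.map(preprocess)`.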
Finally, I ran the training on about 10k lines of annotated data, but the results were basically nonsense, no better than the untrained model.
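The training itself is just the stock `Trainer`; the hyperparameters below are placeholders for what I actually used:

```python
from transformers import Trainer, TrainingArguments

# Hypothetical settings; train_dataset is the preprocessed dataset from above.
args = TrainingArguments(
    output_dir="bert-zero-shot",
    num_train_epochs=3,
    per_device_train_batch_size=16,
)
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()
```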
Am I on the right path, or should I be treating this as a self-supervised fine-tuning task instead?
Thanks a lot for your help!