How to select labels for multilabel zero-shot text classification

Krzysztof · July 28, 2021, 10:30am

Hi, I am using the transformers pipeline for zero-shot classification on a large set of more than 1M student reviews of courses taught in the US and the UK. An example review is below:
“Very nice woman, extremely helpful if you go to her office hours, but scheme is a stupid language which makes this class boring and difficult. Very tough exams. Do good on your labs and projects and you’ll be okay.”
I read that choosing proper labels for zero-shot classification, with many domain-specific words, is key. Can you suggest general rules for creating such labels? Should they be long or short, single domain-specific words or complex sentences? For example, one could take approaches such as:
1.
candidate_labels = ["teaching skills", "interpersonal skills", "grading fairness"]
or
2.
candidate_labels = ["teacher or professor teaching skills", "teacher or professor interpersonal skills", "course grading fairness"]
or
3.
candidate_labels = ["teacher or professor good or bad teaching skills", "teacher or professor good or bad interpersonal skills", "course grading fair or unfair"]
I do not have a labelled test set to compare the accuracy, and I would like to avoid labeling a test set, as it is tedious.
Any suggestions? Maybe there are some papers that deal with this problem?
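
For context on why wording matters: the zero-shot pipeline inserts each candidate label into a hypothesis template (the default is "This example is {}.") and asks an NLI model whether the review entails that sentence, so the label and the template are read together. Below is a minimal sketch of that step. The helper names `build_hypotheses` and `classify_reviews`, and the template "This course review is about {}.", are assumptions made for illustration, not something recommended in this thread.

```python
# Default hypothesis template used by the transformers zero-shot pipeline.
DEFAULT_TEMPLATE = "This example is {}."


def build_hypotheses(candidate_labels, template=DEFAULT_TEMPLATE):
    """Return the hypothesis sentence the NLI model sees for each label."""
    return [template.format(label) for label in candidate_labels]


def classify_reviews(reviews, candidate_labels, template=DEFAULT_TEMPLATE,
                     model_name="facebook/bart-large-mnli"):
    """Score every review against every label independently (multi-label)."""
    # Imported lazily so build_hypotheses works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("zero-shot-classification", model=model_name)
    return classifier(reviews, candidate_labels,
                      hypothesis_template=template, multi_label=True)


labels = ["teaching skills", "interpersonal skills", "grading fairness"]
# Hypothetical domain-specific template (an assumption for illustration).
review_template = "This course review is about {}."

# Print the sentences the model would actually be asked to judge.
for hypothesis in build_hypotheses(labels, review_template):
    print(hypothesis)
```

Reading the generated hypotheses aloud is a cheap sanity check: if a sentence sounds unnatural, rewording either the label or the template may be worth trying. Calling `classify_reviews(reviews, labels, review_template)` would run the actual model, which downloads weights on first use.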

tsleolima · July 10, 2022, 5:51pm

Hello, did you find out how the candidate labels work?

jgcoke · November 21, 2022, 5:11pm

Does anyone have a few basic guidelines for choosing the labels? I think the main way to improve zero-shot learning is to select good labels, but I didn’t find any resource that addresses this problem.
I was thinking of measuring the distance between the label embeddings and trying to change the wording to maximize it, which could help to improve the accuracy.
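
The idea above could be sketched roughly as follows: embed each label, then compute the mean pairwise cosine distance of a label set so that different wordings can be compared. This is only an illustration. `toy_embed` is a bag-of-words stand-in so the code runs without any model, and whether a larger distance actually improves classification accuracy is untested here. A real sentence encoder (for example the `encode` method of a sentence-transformers model) could be passed as `embed` instead.

```python
import math
from itertools import combinations


def toy_embed(labels):
    """Toy stand-in for a sentence encoder: bag-of-words count vectors."""
    vocab = sorted({word for label in labels for word in label.lower().split()})
    return [[float(label.lower().split().count(word)) for word in vocab]
            for label in labels]


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two dense vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / norm if norm else 1.0


def mean_pairwise_distance(labels, embed=toy_embed):
    """Average cosine distance over all pairs of label embeddings."""
    vectors = embed(labels)
    pairs = list(combinations(vectors, 2))
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


short_labels = ["teaching skills", "interpersonal skills", "grading fairness"]
long_labels = ["teacher or professor teaching skills",
               "teacher or professor interpersonal skills",
               "course grading fairness"]

print("short:", round(mean_pairwise_distance(short_labels), 3))
print("long: ", round(mean_pairwise_distance(long_labels), 3))
```

With the toy embedding, shared filler words such as "teacher or professor" pull the longer labels closer together. A real encoder may behave differently, so the numbers are only meaningful relative to the encoder used.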

mabu · November 21, 2022, 6:13pm

Hi, probably not the answer you’re looking for, but SetFit is a great alternative to the zero-shot pipeline. It can work without any labelled data at all, or with as few as 8 labelled examples. This helps with label calibration; although it doesn’t completely avoid label engineering, it does improve model performance. Check out a training example here, which also compares to the zero-shot pipeline.

Alternatively, you may want to settle on some labels and then compare models. I have found, for example, that roberta-large-mnli worked a lot better than the default bart-large-mnli on the same labels.

miOmiO · November 23, 2022, 9:00pm

Hi @mabu,

I tried to use SetFit to perform zero-shot classification following this link:

setfit/zero-shot-classification.ipynb at main · huggingface/setfit (github.com). However, the example uses a hold-out dataset for evaluation, while my case requires running inference on unlabeled data. Currently I have the trainer ready:

from setfit import SetFitModel, SetFitTrainer

model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")
trainer = SetFitTrainer(
    model=model,
    train_dataset=train_dataset,
)
trainer.train()

Can I use the trainer directly for inference, instead of pushing the model to the HF Hub and reloading it from there? If yes, could you give me an example of the next step? Thank you!

mabu · November 24, 2022, 2:19pm

@miOmiO yes!

You can do it in a couple of ways.
Via __call__: trainer.model([sequence_to_classify])
Via predict: trainer.predict([sequence_to_classify])
And I think this is also possible: trainer.model.predict_proba([sequence_to_classify])
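
For a large unlabeled set, the calls above could be wrapped in a simple batching loop. The sketch below assumes only that the model is a callable that takes a list of strings and returns one prediction per string, as `trainer.model` is described above. `predict_in_batches`, `batch_size`, and `fake_model` are hypothetical names introduced for illustration, and `fake_model` is a stand-in so the code runs without setfit installed.

```python
def predict_in_batches(model, texts, batch_size=32):
    """Call `model` on successive slices of `texts` and collect the predictions."""
    predictions = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        # `model` is assumed to accept a list of strings, e.g. trainer.model.
        predictions.extend(model(batch))
    return predictions


def fake_model(batch):
    """Stand-in classifier: predicts 1 if the text mentions 'exam', else 0."""
    return [int("exam" in text.lower()) for text in batch]


reviews = ["Very tough exams.", "Very nice woman.", "Do good on your labs."]
print(predict_in_batches(fake_model, reviews, batch_size=2))
```

With a trained SetFit model, the call would look like `predict_in_batches(trainer.model, reviews)`. The individual predictions may come back as tensor or array elements and might need converting to plain Python values.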

miOmiO · November 28, 2022, 5:03pm

Thank you so much, I just tried it and it works. @mabu
