Multi-Label Model - Labels per part of sentence

TardisPilot · March 8, 2023, 8:12pm

Hi All! I have fine-tuned a multi-label model based on BERT base-uncased. I was asked to output predictions from the model not just as the top-k labels predicted for the text, but as the top-k labels together with the part of the text associated with each label.

Is this possible? I’ve seen the outputs of NER models that show the token the prediction was based on. Is it possible to do the same/similar thing for a multi-label model?

An output example might be:

text = "I am paid well and have a great team"
pipe(text)

'label': 'Pay',
'score': 0.98,
'tokens': {'I', 'am', 'paid'}
etc.

I greatly appreciate any guidance/suggestions on how to get this data out.

Eitanli · March 9, 2023, 12:12pm

Hey Tardis,
Interesting problem.
A few questions: how long are your input texts? How long should the output tokens be?
I did something similar once, by first splitting the text into sentences (a sentence tokenizer is available in NLTK) and then applying the multi-label model to each sentence. I then filtered by a probability threshold and merged the results to get the final list of labels. You could do the same, but keep the labels per sentence, so that each sentence is the part of the text associated with its labels.
Eitan
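
The sentence-level approach described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code either poster used: `split_sentences` is a naive regex stand-in for a real sentence tokenizer such as NLTK's, `labels_per_sentence` is a hypothetical helper name, and `toy_score_fn` is a keyword-based placeholder for the fine-tuned model, which is expected to return a score per label for a given piece of text.

```python
import re
from typing import Callable, Dict, List


def split_sentences(text: str) -> List[str]:
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def labels_per_sentence(
    text: str,
    score_fn: Callable[[str], Dict[str, float]],
    threshold: float = 0.5,
    top_k: int = 3,
    splitter: Callable[[str], List[str]] = split_sentences,
) -> List[dict]:
    """Score each sentence separately and keep its labels above the threshold."""
    results = []
    for sentence in splitter(text):
        scores = score_fn(sentence)
        # Keep labels that pass the threshold, highest score first, at most top_k.
        kept = sorted(
            ((label, s) for label, s in scores.items() if s >= threshold),
            key=lambda pair: pair[1],
            reverse=True,
        )[:top_k]
        for label, score in kept:
            results.append({"label": label, "score": score, "text": sentence})
    return results


def toy_score_fn(sentence: str) -> Dict[str, float]:
    """Hypothetical placeholder for the fine-tuned multi-label model."""
    lowered = sentence.lower()
    return {
        "Pay": 0.98 if "paid" in lowered else 0.02,
        "Team": 0.95 if "team" in lowered else 0.03,
    }


if __name__ == "__main__":
    for row in labels_per_sentence("I am paid well. I have a great team.", toy_score_fn):
        print(row)
```

To use a real model, `score_fn` would wrap the model call and return a dict mapping each label to its sigmoid probability. Note that the example sentence from the original question has no sentence boundary, so a finer-grained `splitter` (for example, one that splits on clauses) would be needed to attribute labels to parts within a single sentence.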
