Hi @murdockthedude. I'm using some sensitive (biomedical) data and my use case is actually a little more complicated than "just" multi-label NER, so I'd have to make up some dummy data and simplify my notebook a bit. Let me think more about that and how to make a shareable notebook.
In the meantime, here are answers to your first two questions.
The custom trainer is all that's needed, although you'll probably also want to implement a special compute_metrics function and pass it when you instantiate the trainer so you can do early stopping.
In my case I used BertForTokenClassification, but using AutoModelForTokenClassification should be fine, I think.
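The custom trainer mentioned above can be sketched roughly like this. This is a minimal illustration, not the original poster's actual code: it assumes the labels arrive as multi-hot float vectors of shape `(batch, seq_len, num_labels)`, and the class name `MultiLabelTrainer` is made up for the example. The key change is swapping the default cross-entropy loss for `BCEWithLogitsLoss`, which treats each label as an independent binary decision per token.

```python
import torch
from transformers import Trainer


class MultiLabelTrainer(Trainer):
    """Trainer variant for multi-label token classification (illustrative)."""

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        # Assumes labels are multi-hot vectors: (batch, seq_len, num_labels).
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        logits = outputs.logits  # (batch, seq_len, num_labels)

        # BCEWithLogitsLoss applies a sigmoid per label, so each token can
        # carry several labels at once (unlike the default CrossEntropyLoss).
        loss_fct = torch.nn.BCEWithLogitsLoss()
        loss = loss_fct(logits, labels.float())
        return (loss, outputs) if return_outputs else loss
```

Note this simple sketch applies the loss to every position; in practice you would also mask out padding and special tokens before computing the loss.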
Hey @drussellmrichie, totally understand, thank you. I'll try to get a small notebook working too to see if I can tape this all together.
One question I have: Assuming I implement the custom trainer approach above, at inference time for multi-label token classification, do you just take the individual output logits and run them through a sigmoid activation to get your final per-label-per-token probabilistic values (as opposed to single label, which runs them through a softmax)? Or is it something more complex than that?
@murdockthedude I don't think you even need to bother with sigmoid: for hard predictions you can just threshold the logits at zero, as in lambda x: 1 if x > 0 else 0. There are efficient TF and PyTorch functions for that.
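The two suggestions above agree for hard 0/1 decisions: since sigmoid is monotonic and sigmoid(0) = 0.5, thresholding probabilities at 0.5 is identical to thresholding raw logits at 0. You only need the sigmoid when you want actual probabilities or a decision threshold other than 0.5. A small pure-Python sketch (the helper names here are illustrative):

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(token_logits, threshold=0.5):
    # Thresholding sigmoid probabilities at 0.5 is equivalent to
    # thresholding raw logits at 0, because sigmoid(0) == 0.5 and
    # sigmoid is strictly increasing.
    return [1 if sigmoid(l) > threshold else 0 for l in token_logits]


# Per-label logits for one token:
logits = [2.3, -0.7, 0.1, -4.2]
assert predict_labels(logits) == [1 if l > 0 else 0 for l in logits]
```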
Hey, sorry for opening this old thread again, this looks pretty much exactly like what I am looking for at the moment. But how did you actually succeed in passing the one-hot encoded labels through to the Trainer?
Whenever I try this, I get errors thrown by DataCollatorForTokenClassification saying that it expects the labels to be integers.
@BunnyNoBugs @murdockthedude
I am working on a morphological analysis problem where each token has multiple labels. Can you share a sample notebook / working example so that I can understand and experiment with my problem?
You can take a look here. The code is a bit messy since it was done under time pressure in the end, but it gets the multi-label token classification done, and we achieved good results with it.