For long documents, I don’t think there’s an ideal solution right now. If truncation isn’t satisfactory, then the best thing you can do is probably to split the document into smaller segments and ensemble the per-segment scores somehow, for example by averaging them.
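To make the split-and-ensemble idea concrete, here is a rough sketch. The names split_into_segments, ensemble_scores and score_fn are mine, not part of any library: score_fn stands in for whatever you use to score a single segment (e.g. a wrapper around the zero-shot classifier that returns a label-to-score dict). The window size and overlap are arbitrary assumptions you would need to tune.

```python
from statistics import mean
from typing import Callable, Dict, List


def split_into_segments(text: str, max_words: int = 200, overlap: int = 20) -> List[str]:
    """Split text into overlapping windows of at most max_words words."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    segments = []
    for start in range(0, len(words), step):
        segments.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reaches the end of the text
    return segments


def ensemble_scores(
    segments: List[str],
    score_fn: Callable[[str], Dict[str, float]],  # hypothetical: one segment -> {label: score}
    how: str = "mean",
) -> Dict[str, float]:
    """Combine per-segment label scores with a mean (default) or a max."""
    per_label: Dict[str, List[float]] = {}
    for segment in segments:
        for label, score in score_fn(segment).items():
            per_label.setdefault(label, []).append(score)
    reduce = max if how == "max" else mean
    return {label: reduce(scores) for label, scores in per_label.items()}
```

Which reduction makes sense probably depends on the task: a mean asks whether the label fits the document on the whole, while a max asks whether it fits any one segment, which may be closer to what you want for tagging.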
I do see a lot of high scores (> 0.9) when multi_class = True for a list of custom tags …
Yeah, unfortunately this will just happen sometimes. It’s the reason multi_class=False is recommended when possible: it’s a lot easier to tell which one of K labels is correct than to predict each label independently from the class name alone, which is what you’re doing with multi_class=True. You might just have to try out a bunch of examples and see what threshold works best. Telling whether the class name y applies to the sentence x without any training data or additional context is a really hard problem. So far this method is the best I’ve encountered, but hopefully we can improve with time.
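If you can hand-label even a small set of examples for one tag, picking the threshold can be done with a simple sweep. This is only a sketch under my own assumptions: the function names are hypothetical, F1 is just one reasonable criterion, and the scores and labels in the usage lines are made up for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def f1_at_threshold(scores: Sequence[float], gold: Sequence[bool], threshold: float) -> float:
    """F1 for a single tag when predicting 'applies' for score >= threshold."""
    tp = fp = fn = 0
    for score, is_positive in zip(scores, gold):
        predicted = score >= threshold
        if predicted and is_positive:
            tp += 1
        elif predicted and not is_positive:
            fp += 1
        elif is_positive:
            fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(
    scores: Sequence[float],
    gold: Sequence[bool],
    candidates: Optional[List[float]] = None,
) -> Tuple[float, float]:
    """Return (threshold, F1) for the best candidate; ties go to the lowest threshold."""
    if candidates is None:
        candidates = [i / 100 for i in range(50, 100)]  # 0.50 .. 0.99
    return max(((t, f1_at_threshold(scores, gold, t)) for t in candidates), key=lambda p: p[1])


# Made-up scores for one tag, with hand-assigned labels.
scores = [0.95, 0.92, 0.97, 0.60, 0.99, 0.91]
gold = [False, False, True, False, True, False]
threshold, f1 = best_threshold(scores, gold)
```

With scores bunched above 0.9 like this, the useful threshold can end up well above 0.9, and it may differ from tag to tag, so it is probably worth sweeping each tag separately.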