The documentation for the labels parameter of BertForTokenClassification says that indices should be in [0, ..., config.num_labels - 1].
But BertConfig doesn't have a num_labels parameter as far as I can tell, so what is this config.num_labels argument?
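For reference, here is roughly how I'm creating the model (a minimal sketch; I'm assuming num_labels can simply be passed as a keyword to from_pretrained, since extra kwargs seem to end up on the config object):

```python
from transformers import BertForTokenClassification

# num_labels isn't listed among BertConfig's documented arguments, but
# passing it here appears to land on model.config, which is where the
# classification head reads its output size from.
model = BertForTokenClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=3,  # classes 0, 1, 2 in my example below
)

print(model.config.num_labels)        # 3
print(model.classifier.out_features)  # 3 -- head size follows num_labels
```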
Also, this tutorial says that we can set the labels we want the model to ignore to -100. If that is correct, why doesn't the documentation for BertForTokenClassification mention it? Maybe it's not correct, because when I build my labels that way, I get the error
/opt/conda/conda-bld/pytorch_1591914880026/work/aten/src/THCUNN/ClassNLLCriterion.cu:108: cunn_ClassNLLCriterion_updateOutput_kernel: block: [0,0,0], thread: [27,0,0] Assertion `t >= 0 && t < n_classes` failed
which indicates to me that I cannot have labels with values outside the interval [0, n_classes - 1].
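For what it's worth, plain CrossEntropyLoss does ignore -100 targets by default, which is why I expected this to work. A minimal sketch on the CPU (the shapes mimic what I believe the model hands to the loss for a single sequence):

```python
import torch
import torch.nn as nn

loss_fct = nn.CrossEntropyLoss()  # ignore_index defaults to -100

logits = torch.randn(10, 3)  # (seq_len, num_labels)
labels = torch.tensor([-100, 1, 0, 0, 1, 0, 2, 1, -100, -100])

# Positions labeled -100 are excluded from the loss; no assertion fires here.
print(loss_fct(logits, labels))
```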
What I have are labels for the class each wordpiece belongs to, after tokenization. Then I add the special tokens and padding, and I set the labels for the special tokens and padding to -100. So, for example, if I want a sequence length of 10, and I want to classify wordpieces containing an 'o' as class 1 and wordpieces containing a 'p' as class 2, I would have, for the sentence "Oh, that school is pretty cool":
Tokens: ['oh', ',', 'that', 'school', 'is', 'pretty', 'cool']
With special tokens: ['[CLS]', 'oh', ',', 'that', 'school', 'is', 'pretty', 'cool', '[SEP]', '[PAD]']
Labels: [-100, 1, 0, 0, 1, 0, 2, 1, -100, -100]
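And here is roughly how I'm building those labels (a sketch; label_for_piece is my own toy helper encoding the rule above, not anything from transformers):

```python
from transformers import BertTokenizer

def label_for_piece(piece: str) -> int:
    # My toy rule: 'p' takes precedence over 'o'; everything else is class 0.
    if "p" in piece:
        return 2
    if "o" in piece:
        return 1
    return 0

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
encoding = tokenizer(
    "Oh, that school is pretty cool",
    padding="max_length",
    max_length=10,
    return_tensors="pt",
)

tokens = tokenizer.convert_ids_to_tokens(encoding["input_ids"][0].tolist())
labels = [
    -100 if tok in ("[CLS]", "[SEP]", "[PAD]") else label_for_piece(tok)
    for tok in tokens
]
print(tokens)  # ['[CLS]', 'oh', ',', 'that', 'school', 'is', 'pretty', 'cool', '[SEP]', '[PAD]']
print(labels)  # [-100, 1, 0, 0, 1, 0, 2, 1, -100, -100]
```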