Hi,
So I trained a NER model using BERT.
Just wanted to understand: is it expected that one of our labels gets predicted for the [CLS] or [SEP] token?
Is that normal, or is there something I might have done wrong?
While training, I assigned them a label of -100 so that they would be ignored by the loss.
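For reference, this is roughly how I did the label alignment (a minimal sketch, assuming a fast tokenizer since word_ids() is only available there; words and word_labels are just placeholder names for a pre-split sentence and its per-word tags):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def align_labels(words, word_labels):
    encoding = tokenizer(words, is_split_into_words=True, truncation=True)
    labels = []
    for word_id in encoding.word_ids():
        # Special tokens ([CLS], [SEP], [PAD]) have no word id,
        # so they get -100 and are skipped by the loss.
        labels.append(-100 if word_id is None else word_labels[word_id])
    encoding["labels"] = labels
    return encoding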
Here is a sample output when I run prediction on a sentence; [CLS] and [SEP] get encoded to token IDs 101 and 102:
idx  token_id  pred
0    101       0      <- [CLS]
1    1045      0
2    2215      0
3    2000      0
4    4965      0
5    1037      0
6    2417      0
7    18059     0
8    2007      0
9    16380     1
10   1021      2
11   102       2      <- [SEP]
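For now, my workaround is to drop the special-token positions myself at inference. A rough sketch of what I mean (model, tokenizer, and id2label stand in for my trained model, its tokenizer, and my label map):

import torch

def predict_entities(text, model, tokenizer, id2label):
    # Run the model and take the argmax label per token.
    encoding = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**encoding).logits  # (1, seq_len, num_labels)
    preds = logits.argmax(dim=-1)[0].tolist()

    # Mark which positions are special tokens ([CLS], [SEP], ...).
    special = tokenizer.get_special_tokens_mask(
        encoding["input_ids"][0].tolist(), already_has_special_tokens=True
    )
    tokens = tokenizer.convert_ids_to_tokens(encoding["input_ids"][0])

    # Keep only predictions for real tokens.
    return [(tok, id2label[p])
            for tok, p, s in zip(tokens, preds, special) if s == 0]

Is dropping them like this the right thing to do, or should the model never be predicting labels there in the first place?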
Any input would be helpful for my understanding.