Ask for help with prediction results of Named Entity Recognition Task

g3casey · May 17, 2021, 1:11pm

It looks like you are not using the “fast” version of the tokenizer. Check which tokenizer class you are loading; for RoBERTa the fast one is RobertaTokenizerFast.

https://huggingface.co/transformers/model_doc/roberta.html#robertatokenizerfast
from transformers import RobertaTokenizerFast
tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")

tokenizer("Hello world")['input_ids']
[0, 31414, 232, 328, 2]
tokenizer(" Hello world")['input_ids']
[0, 20920, 232, 2]
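One common reason the fast tokenizer matters for NER is that only fast tokenizers expose `word_ids()` on their output, which maps each sub-token back to the word it came from. The sketch below is a rough illustration, not necessarily what the original poster needs. The helper names `is_fast` and `align_labels`, the example label ids, and the hand-written `word_ids` list are all made up for this example. It avoids downloading a model by using a stand-in list in place of a real `enc.word_ids()` result.

```python
from typing import List, Optional, Sequence

# -100 is the default ignore_index of PyTorch's cross-entropy loss.
IGNORE_INDEX = -100


def is_fast(tokenizer) -> bool:
    """Return True if the tokenizer reports itself as a "fast" tokenizer.

    Hugging Face tokenizers expose an `is_fast` attribute; anything
    without that attribute is treated as not fast.
    """
    return bool(getattr(tokenizer, "is_fast", False))


def align_labels(
    word_ids: Sequence[Optional[int]],
    word_labels: Sequence[int],
) -> List[int]:
    """Give each word's label to its first sub-token and mask the rest.

    `word_ids` has one entry per token: the index of the source word,
    or None for special tokens such as <s> and </s>.
    """
    aligned = []
    previous = None
    for wid in word_ids:
        if wid is None:
            # Special token: excluded from the loss.
            aligned.append(IGNORE_INDEX)
        elif wid != previous:
            # First sub-token of a new word: carries the word's label.
            aligned.append(word_labels[wid])
        else:
            # Continuation sub-token of the same word: masked.
            aligned.append(IGNORE_INDEX)
        previous = wid
    return aligned


# Hypothetical stand-in for enc.word_ids(): <s>, word 0, word 1 split
# into two sub-tokens, </s>. With a real fast tokenizer (not run here):
#   tok = RobertaTokenizerFast.from_pretrained("roberta-base", add_prefix_space=True)
#   enc = tok(["Hello", "world"], is_split_into_words=True)
#   word_ids = enc.word_ids()
example_word_ids = [None, 0, 1, 1, None]
example_labels = [3, 0]  # made-up label ids, one per word

print(align_labels(example_word_ids, example_labels))
# prints [-100, 3, 0, -100, -100]
```

Labeling only the first sub-token is one convention among several. Another is to copy the word's label onto every sub-token, and which one works better may depend on the dataset.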
