I want to fine-tune this model:

from transformers import BertForTokenClassification
model = BertForTokenClassification.from_pretrained('monilouise/ner_pt_br')
with this dataset:
from datasets import load_dataset
raw_datasets = load_dataset('lener_br')
The raw_datasets loaded are already tokenized and encoded, and I don't know how they were tokenized. Now I want to pad the inputs, but I don't know how to use DataCollatorWithPadding in this case.
I noticed that this dataset is similar to the wnut dataset used in the token-classification docs, but I still can't figure out what I should do.
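For reference, here is a minimal sketch of what I think is needed. I'm assuming the dataset exposes word-level `tokens` and integer `ner_tags` columns, that the model checkpoint ships a usable tokenizer, and that `DataCollatorForTokenClassification` (which pads the labels as well as the inputs, unlike `DataCollatorWithPadding`) is the right collator here; `align_labels_with_tokens` is a helper name I made up:

```python
from typing import List, Optional

def align_labels_with_tokens(labels: List[int], word_ids: List[Optional[int]]) -> List[int]:
    """Spread word-level NER labels over subword tokens.

    Special tokens (word_id None) and continuation subwords get -100,
    which the PyTorch cross-entropy loss ignores by default.
    """
    aligned, previous = [], None
    for wid in word_ids:
        if wid is None or wid == previous:
            aligned.append(-100)
        else:
            aligned.append(labels[wid])
        previous = wid
    return aligned

# Intended usage (not run here; requires downloading the model and dataset):
#
# from transformers import AutoTokenizer, DataCollatorForTokenClassification
# from datasets import load_dataset
#
# tokenizer = AutoTokenizer.from_pretrained('monilouise/ner_pt_br')
#
# def tokenize_and_align(batch):
#     enc = tokenizer(batch['tokens'], truncation=True, is_split_into_words=True)
#     enc['labels'] = [
#         align_labels_with_tokens(labels, enc.word_ids(i))
#         for i, labels in enumerate(batch['ner_tags'])
#     ]
#     return enc
#
# raw_datasets = load_dataset('lener_br')
# tokenized = raw_datasets.map(tokenize_and_align, batched=True,
#                              remove_columns=raw_datasets['train'].column_names)
# data_collator = DataCollatorForTokenClassification(tokenizer=tokenizer)

# Quick sanity check on the helper with made-up word_ids:
print(align_labels_with_tokens([3, 0], [None, 0, 0, 1, None]))
# → [-100, 3, -100, 0, -100]
```

Does that look like the right approach, or is there a simpler way to pad this dataset directly?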