Actually, you could transfer the weights from the old classifier to the new one.
In PyTorch, if your classifier is this:
classifier1 = nn.Linear(hidden_size, num_labels)
its weight will have shape (num_labels, hidden_size).
So if you add k more labels, the new classifier will be
classifier2 = nn.Linear(hidden_size, num_labels + k)
and its weight will have shape (num_labels + k, hidden_size).
You would then copy the old weights (and bias) into the first num_labels rows:
classifier2.weight.data[:num_labels, :] = classifier1.weight.data
classifier2.bias.data[:num_labels] = classifier1.bias.data
You’ll still need to train on samples that cover all the labels (otherwise it will forget the original ones), and you need to make sure that the id for each existing label stays the same. I.e. if your label2id is originally {"person": 0, "org": 1, "misc": 2}, the new label2id should be {"person": 0, "org": 1, "misc": 2, "price": 3, "product_names": 4}.
Oh, and you should init the classifier2 weights before copying the old weights over. It’s common to do something like this:
module.weight.data.normal_(mean=0.0, std=std)
if module.bias is not None:
    module.bias.data.zero_()
where std is the initializer_range from the config, usually 0.02.
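Putting the steps above together, here’s a minimal runnable sketch (the sizes are made up for illustration; in practice hidden_size and the old weights come from your trained model):

```python
import torch
import torch.nn as nn

hidden_size, num_labels, k = 768, 3, 2  # hypothetical sizes

# Pretend this is the trained classifier head
classifier1 = nn.Linear(hidden_size, num_labels)

# New head with k extra labels
classifier2 = nn.Linear(hidden_size, num_labels + k)

# Init the new head the usual way (std = initializer_range, usually 0.02)
classifier2.weight.data.normal_(mean=0.0, std=0.02)
if classifier2.bias is not None:
    classifier2.bias.data.zero_()

# Copy the old weights (and bias) into the first num_labels rows
classifier2.weight.data[:num_labels, :] = classifier1.weight.data
classifier2.bias.data[:num_labels] = classifier1.bias.data

# Sanity check: the original labels produce the same logits as before
x = torch.randn(1, hidden_size)
assert torch.allclose(classifier1(x), classifier2(x)[:, :num_labels], atol=1e-5)
```

The rows of the new k labels stay freshly initialized, which is why you still need to fine-tune on data covering all the labels afterwards.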