Actually, you could transfer the weights from the old classifier to the new one.
In PyTorch, if your classifier is this:
classifier1 = nn.Linear(hidden_size, num_labels)
its weight will have shape (num_labels, hidden_size).
So if you add k more labels, the new classifier will be
classifier2 = nn.Linear(hidden_size, num_labels + k)
and its weight will have shape (num_labels + k, hidden_size).
You would then copy the old weights (and bias) into the first num_labels rows:
classifier2.weight.data[:num_labels, :] = classifier1.weight.data
classifier2.bias.data[:num_labels] = classifier1.bias.data
You’ll still need to train on samples that cover all the labels (otherwise it will forget the original ones), and you need to make sure that the id for each existing label stays the same. I.e. if your label2id is originally {"person": 0, "org": 1, "misc": 2}, the new label2id should be {"person": 0, "org": 1, "misc": 2, "price": 3, "product_names": 4}.
Oh, and you should init the classifier2 weights before copying the old weights over. It’s common to do something like this:
module.weight.data.normal_(mean=0.0, std=std)
if module.bias is not None:
    module.bias.data.zero_()
where std is the initializer_range from the config, usually 0.02.
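Putting the steps above together, here’s a minimal runnable sketch (the sizes are made up for illustration; in practice hidden_size and the old weights come from your trained model):

```python
import torch
import torch.nn as nn

hidden_size, num_labels, k = 768, 3, 2  # hypothetical sizes

# Pretend this is the trained classifier head
classifier1 = nn.Linear(hidden_size, num_labels)

# New head with k extra labels
classifier2 = nn.Linear(hidden_size, num_labels + k)

# Init the new head the usual way (std = initializer_range, usually 0.02)
classifier2.weight.data.normal_(mean=0.0, std=0.02)
if classifier2.bias is not None:
    classifier2.bias.data.zero_()

# Copy the old weights (and bias) into the first num_labels rows
classifier2.weight.data[:num_labels, :] = classifier1.weight.data
classifier2.bias.data[:num_labels] = classifier1.bias.data

# Sanity check: the original labels produce the same logits as before
x = torch.randn(1, hidden_size)
assert torch.allclose(classifier1(x), classifier2(x)[:, :num_labels], atol=1e-5)
```

The rows of the new k labels stay freshly initialized, which is why you still need to fine-tune on data covering all the labels afterwards.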