I am trying to define my own token classification model class in PyTorch
so that it can be used in a similar way to the RobertaForTokenClassification
class from transformers, and so that the trained model can be saved using save_pretrained
and reloaded using from_pretrained.
I thought it would be easiest to slightly modify the code of RobertaForTokenClassification from transformers.
The definition of the MyModelForTokenClassification class is as follows:
class MyConfig(RobertaConfig):
    def __init__(self, **kwargs):
        self.additional_parameter = …
        super().__init__(**kwargs)

class MyModelForTokenClassification(RobertaPreTrainedModel):
    config_class = MyConfig

    def __init__(self, config, additional_data=None):
        super().__init__(config)
        self.additional_data = additional_data
        self.additional_parameter = config.additional_parameter
        self.num_labels = config.num_labels
        self.model = RobertaModel(config, add_pooling_layer=False)
        classifier_dropout = (
            config.classifier_dropout if config.classifier_dropout is not None else config.hidden_dropout_prob
        )
        self.dropout = torch.nn.Dropout(classifier_dropout)
        hidden_size = config.hidden_size
        if self.additional_data is not None:
            hidden_size = .....
        self.classifier = torch.nn.Linear(hidden_size, config.num_labels)
        self.post_init()

    def forward(.....):
        ......
In order to use AutoModelForTokenClassification, I added these 2 lines of code:
AutoConfig.register("roberta", MyConfig, exist_ok=True)
AutoModelForTokenClassification.register(MyConfig, MyModelForTokenClassification, exist_ok=True)
When creating a model for training:
model = AutoModelForTokenClassification.from_pretrained(
    "roberta-base",
    num_labels=3,
    id2label=id2label,
    label2id=label2id,
    additional_parameter=…,
    additional_data=…,
)
this message appears:
Some weights of MyModelForTokenClassification were not initialized from the model checkpoint
at roberta-base and are newly initialized:
['roberta.model.encoder.layer.9.attention.self.query.weight',
 'roberta.model.encoder.layer.9.attention.output.LayerNorm.weight',
 'roberta.model.encoder.layer.11.output.LayerNorm.bias',
 'roberta.model.encoder.layer.3.output.LayerNorm.weight',
 'roberta.model.encoder.layer.4.attention.self.value.bias',
 'roberta.model.encoder.layer.5.intermediate.dense.bias',
 'roberta.model.encoder.layer.10.output.dense.weight',
 'roberta.model.encoder.layer.5.attention.output.dense.weight',
 …]
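To make the mismatch concrete, the parameter names can be compared directly. This is only a minimal sketch with two illustrative keys (one taken from the warning above, one in the form a stock roberta-base checkpoint uses), not the real state dicts:

```python
# Illustrative sketch: compare one parameter name from the warning above with
# the corresponding name as it appears in a stock roberta-base checkpoint.
# These are single example keys, not full state dicts.
checkpoint_keys = {"roberta.encoder.layer.9.attention.self.query.weight"}
model_keys = {"roberta.model.encoder.layer.9.attention.self.query.weight"}

# Keys present on one side but not the other are what from_pretrained
# reports as missing / newly initialized.
missing = checkpoint_keys - model_keys      # checkpoint keys the model cannot match
unexpected = model_keys - checkpoint_keys   # model keys the checkpoint cannot match
print(missing)
print(unexpected)
```

Every key in the warning differs from the checkpoint naming in the same way, so no backbone weight gets matched.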
It shows that the model has not been initialised with the weights from 'roberta-base'.
What else should I do to initialise MyModelForTokenClassification with the weights from the 'roberta-base' model?