Hi everyone,
I created and trained two bert-base-uncased models with the run_ner.py script from the Hugging Face transformers examples: one predicts the PoS tags and the other predicts the DEPREL tags (both are attributes of the CoNLL-U format).
I trained the two models separately on the same dataset, with a small change to the labels: for the first model the labels are the PoS tags, and for the second model the labels are the DEPREL tags.
Once training finished, I loaded the weights (AutoModelForTokenClassification.from_pretrained), the configurations (AutoConfig.from_pretrained), and the tokenizers (AutoTokenizer.from_pretrained) of the two models using the functions provided by the library.
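Concretely, the loading looks something like this (the checkpoint directories are placeholders for my two run_ner.py output folders):

```python
from transformers import AutoConfig, AutoModelForTokenClassification, AutoTokenizer

# Placeholder paths to the two fine-tuned checkpoints produced by run_ner.py.
config_pos = AutoConfig.from_pretrained("output/pos-model")
tokenizer_pos = AutoTokenizer.from_pretrained("output/pos-model")
model_pos_tag = AutoModelForTokenClassification.from_pretrained(
    "output/pos-model", config=config_pos
)

config_deprel = AutoConfig.from_pretrained("output/deprel-model")
tokenizer_deprel = AutoTokenizer.from_pretrained("output/deprel-model")
model_deprel_tag = AutoModelForTokenClassification.from_pretrained(
    "output/deprel-model", config=config_deprel
)
```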
By doing this I get two different trainers, like the following:
```python
trainer_pos_tag = Trainer(
    model=model_pos_tag,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    compute_metrics=compute_metrics,
)

trainer_deprel_tag = Trainer(
    model=model_deprel_tag,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    compute_metrics=compute_metrics,
)
```
with their respective configurations and tokenizers.
What I would like to achieve is to use these two models to predict both the PoS tag and the DEPREL tag for each token at the same time (so I think I need a single model that predicts both labels, right?).
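To make the question more concrete, this is roughly the kind of "single model" I have in mind. It is a completely hypothetical, untested sketch: the class name and label counts are placeholders, and it omits the loss computation that Trainer would expect.

```python
from torch import nn
from transformers import AutoModel

class BertForPosAndDeprel(nn.Module):
    """Hypothetical two-head model: one shared BERT encoder and one
    linear per-token classification head per tag set."""

    def __init__(self, model_name="bert-base-uncased",
                 num_pos_labels=18, num_deprel_labels=37):  # placeholder counts
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden_size = self.encoder.config.hidden_size
        self.pos_head = nn.Linear(hidden_size, num_pos_labels)
        self.deprel_head = nn.Linear(hidden_size, num_deprel_labels)

    def forward(self, input_ids, attention_mask=None):
        # One shared encoding pass, then two per-token classification heads.
        hidden_states = self.encoder(
            input_ids, attention_mask=attention_mask
        ).last_hidden_state
        return self.pos_head(hidden_states), self.deprel_head(hidden_states)
```

But I don't know how to get the already-trained weights of my two models into something like this, nor how to keep it compatible with Trainer.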
Now my problem is that I need to merge these two models (merge their weights? does that even make sense?), or something like that, before running the prediction step (trainer.predict(test_dataset)).
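For reference, what I can do today is run the two trainers separately over the same test set and align the outputs afterwards (assuming test_dataset is the tokenized test split):

```python
import numpy as np

# Two separate forward passes over the same data: this is what I want to avoid.
pos_output = trainer_pos_tag.predict(test_dataset)
deprel_output = trainer_deprel_tag.predict(test_dataset)

# predictions have shape (num_examples, seq_len, num_labels); argmax gives label ids.
pos_ids = np.argmax(pos_output.predictions, axis=-1)
deprel_ids = np.argmax(deprel_output.predictions, axis=-1)
```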
How can I do this?
Do you have any suggestions?
P.S. The important thing is that the model resulting from the union/merge of these two models is still of type Trainer.
Thanks in advance!
The question is also on StackOverflow with the same title (I can’t put more than 2 links in the post because I’m a new user).