I would like to take a pretrained model and train only new embeddings on a corpus, leaving the rest of the transformer untouched. Then, fine-tune on a task without changing the original embeddings. Finally, swap the embeddings. In short: with the Hugging Face Transformers library, how can I (a) train only the embeddings, (b) keep the embeddings frozen during training, and (c) swap out a model's embeddings?
This follows the approach taken in this article:
- Pre-train a monolingual BERT (i.e. a transformer) in L1 with masked language modeling (MLM) and next sentence prediction (NSP) objectives on an unlabeled L1 corpus.
- Transfer the model to a new language by learning new token embeddings, while freezing the transformer body, with the same training objectives (MLM and NSP) on an unlabeled L2 corpus.
- Fine-tune the transformer for a downstream task using labeled data in L1, while keeping the L1 token embeddings frozen.
- Zero-shot transfer the resulting model to L2 by swapping the L1 token embeddings with the L2 embeddings learned in Step 2.
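Here is a sketch of what I have in mind, using `get_input_embeddings()` / `set_input_embeddings()` and the `requires_grad` flags. The tiny random `BertConfig` is just a stand-in so the snippet runs on its own; in practice I would load a real checkpoint with `from_pretrained`, and a different L2 vocabulary would also require an L2 tokenizer and a resized MLM head:

```python
import torch
from transformers import BertConfig, BertForPreTraining

# Tiny randomly initialized BERT so the sketch is self-contained; in
# practice this would be a pretrained checkpoint, e.g.
# BertForPreTraining.from_pretrained("bert-base-cased").
# BertForPreTraining carries both the MLM and NSP heads from the article.
config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=2,
                    num_attention_heads=2, intermediate_size=64)
model = BertForPreTraining(config)

# Step 2: learn new token embeddings only -- freeze every parameter,
# then unfreeze just the input embeddings.
for param in model.parameters():
    param.requires_grad = False
for param in model.get_input_embeddings().parameters():
    param.requires_grad = True

# Step 3 would be the inverse: unfreeze everything, then freeze only
# model.get_input_embeddings().parameters().

# Step 4: swap in embeddings learned elsewhere. Here the replacement is
# random and kept at the same vocab size for simplicity.
l2_embeddings = torch.nn.Embedding(config.vocab_size, config.hidden_size)
model.set_input_embeddings(l2_embeddings)
model.tie_weights()  # re-tie the MLM decoder weights to the new embeddings

# Sanity check: a forward pass with the swapped embeddings.
with torch.no_grad():
    out = model(input_ids=torch.randint(0, config.vocab_size, (1, 8)))
```

Am I right that freezing via `requires_grad` is the intended way to do this with `Trainer`, or is there a cleaner built-in mechanism?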
Thank you!