I have a dataset of the form (input_text, embedding_of_input_text), where embedding_of_input_text is an embedding of dimension 512 produced by another model (DistilBERT) when it is given input_text as input.
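For concreteness, the data can be thought of as a simple pair dataset. Below is a minimal PyTorch sketch of how I currently hold it; the class and attribute names are just placeholders, not part of any existing code:

```python
import torch
from torch.utils.data import Dataset

class TextEmbeddingDataset(Dataset):
    """Pairs of (input_text, embedding_of_input_text) with 512-dim target vectors."""
    def __init__(self, texts, embeddings):
        self.texts = texts  # list of strings
        # target embeddings produced by the other model, shape (N, 512)
        self.embeddings = torch.as_tensor(embeddings, dtype=torch.float32)

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, idx):
        return self.texts[idx], self.embeddings[idx]
```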
I would like to fine-tune BERT on this dataset such that it learns to produce similar embeddings (i.e. a kind of mimicking).
Furthermore, BERT returns embeddings of dimension 768 by default, while the target embedding_of_input_text vectors have dimension 512.
What is the correct way to do this within the HuggingFace library?
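For reference, here is a rough sketch of what I have in mind: BERT plus a linear layer projecting 768 -> 512, trained with an MSE loss against the target embeddings. The mean pooling, the MSE loss, and the hyperparameters are my own assumptions, and I am not sure this is the idiomatic HuggingFace approach, which is exactly my question:

```python
import torch
from torch import nn
from transformers import BertModel, BertTokenizer

class BertMimic(nn.Module):
    """BERT encoder plus a linear head projecting 768-dim sentence embeddings to 512."""
    def __init__(self, model_name="bert-base-uncased", target_dim=512):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)
        self.projection = nn.Linear(self.bert.config.hidden_size, target_dim)

    def forward(self, input_ids, attention_mask):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        # Mean-pool token states over non-padding positions (pooling choice is an assumption)
        hidden = outputs.last_hidden_state
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
        return self.projection(pooled)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertMimic()
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss_fn = nn.MSELoss()  # regression onto the target embeddings

# One training step on a toy batch (texts and targets are placeholders)
texts = ["an example sentence", "another example"]
targets = torch.randn(2, 512)  # stands in for embedding_of_input_text

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
pred = model(batch["input_ids"], batch["attention_mask"])
loss = loss_fn(pred, targets)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

In particular, I am unsure whether a plain nn.Module wrapper like this is the right pattern, or whether there is a more standard way to do it with the Trainer API.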