I have a pipeline to fine-tune an instance of BertModel on a text-classification task, and I would like to use this fine-tuned model as my base embedding model. As can be seen in the example they provide, different values of matryoshka_dim can be passed into a SentenceTransformer instance through the truncate_dim argument. However, I was not able to do the same with the BertModel in the following snippet from my code:
```python
self.bert_backbone = BertModel.from_pretrained(
    pretrained_model_name_or_path=self.config.embedding_model_file.model_name,
    cache_dir=Path(self.config.embedding_model_file.cache_dir),
).to(self.device)
```
I would also prefer not to swap in a SentenceTransformer instance, because in my training loop I need direct access to the backbone outputs:
```python
bert_outputs: BaseModelOutputWithPoolingAndCrossAttentions = self.bert_backbone(
    input_ids=input_ids, attention_mask=attention_mask
)
bert_logits: Tensor = bert_outputs.last_hidden_state[:, 0, :]  # Take the [CLS] token output
```
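For context, my understanding is that Matryoshka truncation amounts to keeping only the first matryoshka_dim dimensions of the embedding and re-normalizing, so a manual equivalent on the BertModel output might look roughly like this (the helper name and the matryoshka_dim value are my own, not part of either API):

```python
import torch
import torch.nn.functional as F


def truncate_embedding(cls_embedding: torch.Tensor, matryoshka_dim: int) -> torch.Tensor:
    """Keep the first matryoshka_dim dimensions of each embedding and
    L2-normalize the result, as a stand-in for truncate_dim."""
    truncated = cls_embedding[:, :matryoshka_dim]
    return F.normalize(truncated, p=2, dim=-1)


# Hypothetical usage on a (batch, hidden_size) [CLS] embedding, e.g. (8, 768):
cls_embedding = torch.randn(8, 768)
emb = truncate_embedding(cls_embedding, matryoshka_dim=256)  # shape (8, 256)
```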
I am also not sure whether this code would keep working after a simple swap to SentenceTransformer. In any case, I think this is a parameter BertModel could usefully support, and maybe it already does and I am just missing it.
Thanks in advance!