Customize pretrained model for model hub

Hi community,

I would like to add a mean pooling step inside a custom SentenceTransformer class derived from the model sentence-transformers/stsb-xlm-r-multilingual, so that I don't have to perform this extra step after getting the token embeddings.
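To make it concrete, here is a rough sketch of what I have in mind (the class name matches the SentenceTransformerCustom I mention below, but the exact structure is just illustrative, roughly following the usual attention-mask-weighted mean pooling recipe):

```python
import torch
from transformers import AutoTokenizer, AutoModel

class SentenceTransformerCustom(torch.nn.Module):
    """Wraps the base model and applies mean pooling inside forward()."""

    def __init__(self, model_name):
        super().__init__()
        self.backbone = AutoModel.from_pretrained(model_name)

    def forward(self, input_ids, attention_mask):
        outputs = self.backbone(input_ids=input_ids, attention_mask=attention_mask)
        token_embeddings = outputs.last_hidden_state               # (batch, seq_len, hidden)
        mask = attention_mask.unsqueeze(-1).type_as(token_embeddings)
        summed = (token_embeddings * mask).sum(dim=1)              # sum over non-padding tokens
        counts = mask.sum(dim=1).clamp(min=1e-9)                   # number of real tokens per sentence
        return summed / counts                                     # mean-pooled sentence embeddings


# Illustrative usage
model_name = "sentence-transformers/stsb-xlm-r-multilingual"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = SentenceTransformerCustom(model_name)
inputs = tokenizer(["Hello world"], padding=True, truncation=True, return_tensors="pt")
embeddings = model(**inputs)  # shape: (1, hidden_size)
```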

My aim is to push this custom model onto the model hub. Without this custom step, it is trivial, as below:

Simple export:

```python
from transformers import AutoTokenizer, AutoModel

## Instantiate the model
model_name = "sentence-transformers/stsb-xlm-r-multilingual"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

## Save the model and tokenizer files into the cloned repository
model.save_pretrained("path/to/repo/clone/your-model-name")
tokenizer.save_pretrained("path/to/repo/clone/your-model-name")
```
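(As an aside, on a recent transformers version the same export can also be done without cloning the repository manually, via push_to_hub; whether this helps with the custom class is exactly my question.)

```python
## Alternative: push directly, assuming a recent transformers version with push_to_hub
model.push_to_hub("your-model-name")
tokenizer.push_to_hub("your-model-name")
```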

However, after defining my custom class SentenceTransformerCustom, I can't manage to push the definition of this class to the model hub. Do I need to place this custom class definition inside a specific .py file? Or is there anything else I should do in order to correctly import this custom class from the model hub?

Thanks!
