Deploy distiluse-base-multilingual-cased-v2 on SageMaker

ylevine84 · August 21, 2023, 5:32pm

I’m trying to figure out if I can use sentence-transformers/distiluse-base-multilingual-cased-v2 as a real-time inference endpoint on AWS SageMaker to retrieve embeddings and run a model.

Following this guide from @philschmid on AWS, I can see how it uses the Transformers library to load a model stored in an S3 bucket and to create and return the embeddings. Equivalent Transformers loading code is provided on most model pages, but not on the distiluse-base-multilingual-cased-v2 page.

I’m able to upload it as in the tutorial, but the vectors returned from SageMaker have length 768 instead of the 512 I get when I load the model with the sentence_transformers library. I know that this model has a custom dense layer that doesn’t allow it to be fine-tuned with Transformers, and I’m assuming that when I load the model with Transformers it is missing that dense layer at the end. Are there any suggestions that would allow me to load the complete model into SageMaker and return embeddings of length 512?
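One direction that might work, as an untested sketch: the Hugging Face inference toolkit on SageMaker lets you override `model_fn` and `predict_fn` in a `code/inference.py` file inside the model archive, so the model could be loaded with sentence_transformers (which applies the pooling and dense modules) instead of plain Transformers. The assumptions here are that `sentence-transformers` is listed in `code/requirements.txt`, that the archive contains the full sentence_transformers model directory (including the dense layer's folder and `modules.json`), and that the request body looks like `{"inputs": ...}`. The `"vectors"` response key is just an illustrative name.

```python
# code/inference.py -- a minimal sketch, not a verified deployment.

def model_fn(model_dir):
    # Imported here so the package is only required inside the endpoint
    # container (assumed to be installed via code/requirements.txt).
    from sentence_transformers import SentenceTransformer

    # Loads every module of the saved model (transformer, pooling, dense),
    # assuming all of them were included in the uploaded archive.
    return SentenceTransformer(model_dir)


def predict_fn(data, model):
    # Assumed request shape: {"inputs": "one sentence"} or {"inputs": ["a", "b"]}.
    sentences = data["inputs"]
    if isinstance(sentences, str):
        sentences = [sentences]

    # encode() returns one embedding per input sentence.
    embeddings = model.encode(sentences)

    # Convert to plain lists so the response can be serialized as JSON.
    return {"vectors": embeddings.tolist()}
```

If this works as hoped, each entry in `vectors` should have the same length that sentence_transformers returns locally, since the same modules are being run.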

Thanks!!

pserotini · January 25, 2024, 6:50pm

Were you able to solve it? Same problem here…
