Extract hidden layers from a RoBERTa model in SageMaker

Hello,

I have fine-tuned a CamemBERT model (which inherits from RoBERTa) on a custom dataset using SageMaker.
My goal is a language model that can extract embeddings to be used in my search engine.

CamemBERT is trained on a “fill-mask” task.
With the Hugging Face API, outputting hidden states (and thus computing embeddings) is fairly simple:
model = AutoModelForMaskedLM.from_pretrained(args.model_name, output_hidden_states=True)
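For context, here is a minimal sketch of how the hidden states come back from such a model. The model name "camembert-base" and the pooling choice are my own stand-ins, not part of the original setup:

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# "camembert-base" stands in for the fine-tuned model name
tokenizer = AutoTokenizer.from_pretrained("camembert-base")
model = AutoModelForMaskedLM.from_pretrained("camembert-base", output_hidden_states=True)

inputs = tokenizer("J'aime le camembert.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# hidden_states is a tuple: the embedding layer output plus one tensor per
# transformer layer, each of shape (batch, seq_len, hidden_size)
last_hidden = outputs.hidden_states[-1]
sentence_embedding = last_hidden.mean(dim=1)  # simple mean pooling over tokens
```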

But when such a model is deployed on SageMaker, the predict method only returns the text output.
Some post-processing happens that I do not control.

Is there a way to customize the post-processing steps in SageMaker?
Which model architecture should I use to extract embeddings on SageMaker?

Thanks for your help.

For those having the same issue I found a solution.

Train the model on masked LM, and at inference time use the ‘feature-extraction’ pipeline by setting the HF_TASK environment variable.

hub = {
    'HF_TASK': 'feature-extraction'
}
huggingface_model = HuggingFaceModel(
    env=hub,
    model_data="s3://bucket/model.tar.gz",
    role=<SageMaker Role>,
    transformers_version="4.6",
    pytorch_version="1.7",
    py_version="py36",
)
huggingface_model.deploy(initial_instance_count=1, instance_type='<ec2-instance-type>')

The model will return a feature vector for each token. To get a sentence vector you can average the token vectors, or use fancier methods I did not explore.
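As a sketch, averaging the token vectors can be done with numpy. The response shape here, (1, num_tokens, hidden_size) as a nested list, is my assumption about the feature-extraction pipeline output, and the values are made up for illustration:

```python
import numpy as np

def mean_pool(token_vectors):
    """Average per-token vectors into one sentence vector."""
    arr = np.asarray(token_vectors, dtype=np.float32)  # (num_tokens, hidden_size)
    return arr.mean(axis=0)

# hypothetical endpoint response for one sentence, hidden_size of 2 for brevity
response = [[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]]
sentence_vector = mean_pool(response[0])
print(sentence_vector)  # → [3. 4.]
```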

If you want more control over what exactly your model returns, you can customize
output_fn, predict_fn, etc., as described here: GitHub - aws/sagemaker-huggingface-inference-toolkit
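For illustration, a hedged sketch of what such an inference.py override could look like. The hook names (model_fn, predict_fn, output_fn) follow the toolkit's README; the mean-pooling and JSON serialization choices are my own assumptions, not a definitive implementation:

```python
# inference.py -- placed in the model archive's code/ directory
import json

def model_fn(model_dir):
    # load tokenizer + model from the unpacked model.tar.gz
    from transformers import AutoModel, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModel.from_pretrained(model_dir, output_hidden_states=True)
    return model, tokenizer

def predict_fn(data, model_and_tokenizer):
    import torch
    model, tokenizer = model_and_tokenizer
    inputs = tokenizer(data["inputs"], return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # mean-pool the last hidden state into a single sentence vector
    return outputs.last_hidden_state.mean(dim=1).squeeze(0).tolist()

def output_fn(prediction, accept):
    # serialize the sentence vector as JSON
    return json.dumps(prediction)
```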