Extract hidden states from a RoBERTa model in SageMaker


I have fine-tuned a CamemBERT model (which inherits from RoBERTa) on a custom dataset using SageMaker.
My goal is a language model that can produce embeddings for use in my search engine.

CamemBERT is trained for a “fill-mask” task.
With the Hugging Face API, returning the hidden states (and therefore computing embeddings) is fairly simple:
    model = AutoModelForMaskedLM.from_pretrained(args.model_name, output_hidden_states=True)
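
For reference, this is roughly how I get sentence embeddings locally. It is only a minimal sketch: the checkpoint name "camembert-base" stands in for my fine-tuned model, the function names are mine, and mean pooling over the last hidden state is just one possible choice, not something the library prescribes.

```python
def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask value is 1 (lists in, list out)."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def embed(texts, model_name="camembert-base"):
    # Heavy imports stay local so mean_pool can be used without them.
    import torch
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForMaskedLM.from_pretrained(model_name, output_hidden_states=True)
    model.eval()

    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**batch)

    # hidden_states is a tuple: the embedding output first, then one tensor per layer.
    last_layer = outputs.hidden_states[-1].tolist()
    masks = batch["attention_mask"].tolist()
    # Padding tokens are excluded from the average via the attention mask.
    return [mean_pool(vectors, mask) for vectors, mask in zip(last_layer, masks)]
```

Calling embed(["Bonjour le monde"]) should give one vector per input text, which is what I would like the endpoint to return.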

But when I deploy such a model in SageMaker, the predict method only returns the text output.
There is some kind of post-processing that I do not control.

Is there a way to customize the post-processing steps in SageMaker?
What model architecture should I use to extract embeddings on SageMaker?
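
One direction I am looking at, in case it helps frame the question: as far as I understand, the Hugging Face inference toolkit used by the SageMaker Hugging Face containers lets you ship a code/inference.py inside model.tar.gz that overrides model_fn and predict_fn. Below is a minimal sketch under that assumption. The helper normalize_inputs, the "embeddings" response key and the mean pooling are my own choices, not part of any SageMaker API.

```python
# code/inference.py (sketch, assuming the toolkit picks up model_fn and predict_fn)

def normalize_inputs(data):
    """Accept {"inputs": "text"} or {"inputs": ["a", "b"]} and return a list of strings."""
    inputs = data["inputs"] if isinstance(data, dict) else data
    if isinstance(inputs, str):
        return [inputs]
    return list(inputs)


def model_fn(model_dir):
    # Called once to load the model from the unpacked model.tar.gz.
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForMaskedLM.from_pretrained(model_dir, output_hidden_states=True)
    model.eval()
    return model, tokenizer


def predict_fn(data, model_and_tokenizer):
    # Called per request with the decoded payload and whatever model_fn returned.
    import torch

    model, tokenizer = model_and_tokenizer
    texts = normalize_inputs(data)
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**batch)

    last_layer = outputs.hidden_states[-1]  # (batch, seq_len, hidden)
    mask = batch["attention_mask"].unsqueeze(-1).to(last_layer.dtype)  # (batch, seq_len, 1)
    # Masked mean over the sequence dimension, so padding does not affect the result.
    pooled = (last_layer * mask).sum(dim=1) / mask.sum(dim=1)
    return {"embeddings": pooled.tolist()}
```

If this is the intended mechanism, the endpoint would return a JSON object with one embedding per input instead of the fill-mask text output.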

Thanks for your help.