Hello,
I have fine-tuned a CamemBERT model (it inherits from RoBERTa) on a custom dataset using SageMaker.
My goal is to have a language model able to extract embeddings to be used in my search engine.
CamemBERT is trained on a "fill-mask" task.
Using the Hugging Face API, outputting hidden states (and thus computing embeddings) is fairly simple:
model = AutoModelForMaskedLM.from_pretrained(args.model_name, output_hidden_states=True)
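For reference, here is a minimal end-to-end sketch of pulling token embeddings from those hidden states (the example sentence and the choice of the last hidden layer are my own, not from the original setup):

import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained(args.model_name)
model = AutoModelForMaskedLM.from_pretrained(args.model_name, output_hidden_states=True)

inputs = tokenizer("Le chat dort sur le canapé.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# hidden_states is a tuple of (num_layers + 1) tensors, each of shape
# (batch, seq_len, hidden_size); the last entry is a common embedding source
token_embeddings = outputs.hidden_states[-1]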
But when deploying such a model on SageMaker, the predict method only returns the text output.
There is some kind of post-processing that I do not control.
Is there a way to customize the post-processing steps in SageMaker?
What model architecture should I be using to extract embeddings on SageMaker?
Thanks for your help.
For those having the same issue, I found a solution.
Train the model on masked LM, and at inference time use the 'feature-extraction' pipeline by setting the HF_TASK environment variable:
from sagemaker.huggingface import HuggingFaceModel

hub = {
    'HF_TASK': 'feature-extraction'
}
huggingface_model = HuggingFaceModel(
    env=hub,
    model_data="s3://bucket/model.tar.gz",
    role=<SageMaker Role>,
    transformers_version="4.6",
    pytorch_version="1.7",
    py_version="py36",
)
predictor = huggingface_model.deploy(initial_instance_count=1, instance_type='<ec2-instance-type>')
The model will return a feature vector for each token. To get a sentence vector you can average the token vectors, or use other, fancier methods I did not explore.
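For example, here is a minimal mean-pooling sketch (assuming the endpoint returns the usual feature-extraction shape, a nested list of [batch][tokens][hidden_size]; the predict call is illustrative):

import numpy as np

def sentence_vector(features):
    # features: nested list of shape [batch][tokens][hidden_size]
    token_vectors = np.array(features[0])  # first (and only) sentence in the batch
    return token_vectors.mean(axis=0)      # average over tokens -> (hidden_size,)

# Illustrative usage against the deployed endpoint:
# response = predictor.predict({"inputs": "Le chat dort sur le canapé."})
# embedding = sentence_vector(response)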
If you want more control over what exactly your model returns, you can customize
output_fn, predict_fn, etc., as described here: GitHub - aws/sagemaker-huggingface-inference-toolkit
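For instance, here is a sketch of a custom code/inference.py bundled inside model.tar.gz (model_fn and predict_fn are the toolkit's documented hooks; the mean pooling and the {"inputs": ...} payload shape are my assumptions):

# code/inference.py inside model.tar.gz
import torch
from transformers import AutoTokenizer, AutoModel

def model_fn(model_dir):
    # Load the fine-tuned model and tokenizer from the unpacked archive
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModel.from_pretrained(model_dir)
    model.eval()
    return model, tokenizer

def predict_fn(data, model_and_tokenizer):
    model, tokenizer = model_and_tokenizer
    inputs = tokenizer(data["inputs"], return_tensors="pt", truncation=True, padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # Mean-pool the last hidden state into one embedding per input
    embeddings = outputs.last_hidden_state.mean(dim=1)
    return {"embeddings": embeddings.tolist()}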