I’ve deployed the Hate-speech-CNERG/bert-base-uncased-hatexplain multi-class text classification model (along with others) to a SageMaker endpoint. Using the TextClassificationPipeline you’re able to pass in the return_all_scores=True parameter to see all class labels for the input.
However, when using the SageMakerRuntime.invoke_endpoint function, I can only get one class per input. Does anyone have thoughts on an equivalent to the return_all_scores parameter?
Hi Will, when deploying a model to a SageMaker endpoint you can provide a custom inference script in which you can control the behaviour of the model for inference requests. You can find the documentation on how to do this here: Deploy models to Amazon SageMaker
In your particular scenario I would think that you can override the predict_fn and/or the output_fn methods to get to the desired result. You can find an example for an inference.py file here (note that this particular file is for text summarization, so it won't apply to your use case).
To see what code you need to get all scores returned, you can refer back to the code of the TextClassificationPipeline that you already mentioned: transformers/text_classification.py at master · huggingface/transformers · GitHub. If you incorporate this code into your inference script you should be all set.
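To make this concrete, here is a rough sketch of what such an inference.py could look like. The model_fn/predict_fn hook names come from the Inference Toolkit docs, but the bodies are only an illustration of the idea, not a tested drop-in:

```python
# inference.py -- a minimal sketch, not a tested drop-in solution.
# model_fn/predict_fn are the Inference Toolkit hooks; everything inside
# them is an assumption about how you might wire up the pipeline.
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    TextClassificationPipeline,
)


def model_fn(model_dir):
    # model_dir is the directory where SageMaker unpacked the model artifact
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSequenceClassification.from_pretrained(model_dir)
    # return_all_scores=True makes the pipeline emit a score for every class
    return TextClassificationPipeline(
        model=model, tokenizer=tokenizer, return_all_scores=True
    )


def predict_fn(data, pipeline):
    # data has already been deserialized from JSON by the default input_fn
    inputs = data.pop("inputs", data)
    return pipeline(inputs)
```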
Hope that helps; let me know if you have any questions.
Cheers
Heiko
I would like to do the same. Has anyone tried it yet?
I just had a look at the code of the Hugging Face SageMaker Inference Toolkit and found an even easier way to accomplish this: you can pass parameters in your inference request: sagemaker-huggingface-inference-toolkit/handler_service.py at main · aws/sagemaker-huggingface-inference-toolkit · GitHub. That's awesome, because it means you don't even have to provide an inference.py script!
All you need to do is:
predictor.predict({"inputs": "I love using the new Inference DLC.", "parameters": {"return_all_scores": True}})
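If the endpoint is already deployed (as in the original question), you can attach a predictor to it and send the same request; the endpoint name below is just a placeholder:

```python
from sagemaker.huggingface import HuggingFacePredictor

# "my-hatexplain-endpoint" is a placeholder -- use your own endpoint name
predictor = HuggingFacePredictor(endpoint_name="my-hatexplain-endpoint")

result = predictor.predict({
    "inputs": "I love using the new Inference DLC.",
    "parameters": {"return_all_scores": True},
})

# With return_all_scores=True the response contains a
# {"label": ..., "score": ...} entry for every class, not just the top one.
print(result)
```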
Hope that helps!
We also have a documentation section on how the API works: Reference
The Inference Toolkit accepts inputs in the inputs key and supports additional pipelines parameters in the parameters key. You can provide any of the supported kwargs from pipelines as parameters.
{
  "inputs": "This sound track was beautiful! It paints the senery in your mind so well I would recomend it even to people who hate vid. game music!",
  "parameters": {
    "return_all_scores": True
  }
}
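And to close the loop on the original question about SageMakerRuntime.invoke_endpoint: the same JSON body can be sent directly with boto3. A sketch (again, the endpoint name is a placeholder):

```python
import json

import boto3

runtime = boto3.client("sagemaker-runtime")

# "my-hatexplain-endpoint" is a placeholder -- use your own endpoint name
response = runtime.invoke_endpoint(
    EndpointName="my-hatexplain-endpoint",
    ContentType="application/json",
    Body=json.dumps({
        "inputs": "I love using the new Inference DLC.",
        "parameters": {"return_all_scores": True},
    }),
)

print(json.loads(response["Body"].read()))
```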
Thanks to both of you for your help! I will go ahead and try that.
@marshmellow77 I tried it in SageMaker Studio and it still gives only one label. May I ask you to please look at the screenshot to see if anything is not right? Thank you!
All looking good, except that you are using an old version of the transformers library. Try transformers_version="4.12.3", pytorch_version="1.9.1", and py_version="py38". That should work.
Just for reference, you can find the latest version of the DLCs here: Reference
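For completeness, a deployment with those versions pinned could look roughly like this; the model ID is the one from the original question, and the instance type is just an example:

```python
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()  # assumes you run inside SageMaker

hub = {
    "HF_MODEL_ID": "Hate-speech-CNERG/bert-base-uncased-hatexplain",
    "HF_TASK": "text-classification",
}

model = HuggingFaceModel(
    env=hub,
    role=role,
    transformers_version="4.12.3",
    pytorch_version="1.9.1",
    py_version="py38",
)

# instance type is only an example -- pick one that fits your workload
predictor = model.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")
```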
Cheers
Heiko
It finally worked. Thanks Marshmellow!