Hi @philschmid.
I’m getting back to you about using AWS SageMaker for inference with a text2text-generation model like T5.
My objective is to use an ONNX T5 model for inference, but in order to understand the logic behind the SageMaker Hugging Face Inference Toolkit, I started with a T5 model from the HF Hub.
For that, I’m using your notebook deploy_transformer_model_from_hf_hub.ipynb.
It worked, but I was surprised to get a different predicted text than the one I get when I run the model directly in a notebook.
Since, as I understand it, the HF deployment code in AWS SageMaker uses pipeline(), my hypothesis is that arguments like num_beams and max_length have default values that I need to change.
So my question is: how can I change the values of these arguments when deploying from AWS SageMaker? Thanks.
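In case it helps frame the question: my understanding is that the inference toolkit forwards a `parameters` dict from the request body to the underlying `pipeline()` call, so generation arguments could in principle be set per request. A minimal sketch of what I have in mind (the input text and parameter values are just illustrative assumptions):

```python
# Hypothetical request payload for a text2text-generation endpoint.
# The "parameters" dict is assumed to be passed through to pipeline(),
# overriding the generation defaults (num_beams, max_length, ...).
payload = {
    "inputs": "summarize: SageMaker lets you deploy Hugging Face models "
              "as managed real-time inference endpoints.",
    "parameters": {
        "max_length": 128,  # assumed value; would be tuned per task
        "num_beams": 4,     # beam search instead of greedy decoding
    },
}

# With the predictor returned by huggingface_model.deploy(...):
# result = predictor.predict(payload)
```

Is this the intended way to override the pipeline defaults, or does it have to be done at deploy time?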