How to deploy a T5 model to AWS SageMaker for fast inference?

@philschmid: very strange. I think I found out how to pass parameters, but when I pass the same parameters as the ones I used in a Colab notebook, I get two different predictions…

Code from my Colab notebook

from transformers import pipeline

model_name = "xxx"
API_TOKEN = 'xxxx' # API token 
max_target_length = 32 
num_beams = 1

text2text = pipeline(
    "text2text-generation",
    model=model_name,
    use_auth_token=API_TOKEN,
    num_beams=num_beams,
    max_length=max_target_length
) 

# put a prefix before the text
input_text = "xxxxx" # one sentence

# get prediction
pred = text2text(input_text)[0]['generated_text']

# print result
print('input_text |',input_text)
print('prediction |',pred)
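
For what it's worth, the same generation parameters can also be passed at call time instead of at pipeline creation. A minimal sketch of the equivalent call (same placeholder model and input as above):

# equivalent: pass the generation kwargs per call instead of at pipeline creation
pred = text2text(
    input_text,
    max_length=max_target_length,  # 32
    num_beams=num_beams,           # 1
)[0]['generated_text']

Both forms should produce the same prediction in Colab, which is why the SageMaker result surprises me.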

Code I use in the AWS SageMaker Deploy notebook

input_text = "xxxx"

data = {
    "inputs": input_text,
    "parameters": {
        "max_length": 32,
        "num_beams": 1,
    }
}

# send the request to the SageMaker endpoint
predictor.predict(data)
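
For context, here is roughly how the `predictor` was created; a minimal sketch of the Hub deployment path with the SageMaker Hugging Face SDK (the role, container versions, and instance type below are placeholders, not my exact setup):

import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()

# Hub model configuration for the inference toolkit
hub = {
    'HF_MODEL_ID': model_name,              # same "xxx" model as in Colab
    'HF_TASK': 'text2text-generation',
    'HF_API_TOKEN': API_TOKEN,              # private model, so pass the token
}

huggingface_model = HuggingFaceModel(
    env=hub,
    role=role,
    transformers_version='4.6',  # placeholder versions
    pytorch_version='1.7',
    py_version='py36',
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.xlarge',  # placeholder instance type
)

The deployed endpoint then takes the {"inputs": ..., "parameters": {...}} payload shown above, which is where I expected the parameters to match the Colab pipeline.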