Hi!
I created a SageMaker serverless endpoint that serves a fine-tuned text classification model… Now, when I try to invoke it with a sequence longer than the maximum input length (514), it returns the following error, as expected:
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from model with message "{
"code": 400,
"type": "InternalServerException",
"message": "The expanded size of the tensor (997) must match the existing size (514) at non-singleton dimension 1. Target sizes: [1, 997]. Tensor sizes: [1, 514]"
}
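For context, this is roughly how I invoke the endpoint (the endpoint name and input text below are just placeholders; the real request sends a single long document as "inputs"):

```python
import json

import boto3

# Placeholder values: the real endpoint name differs, and the real input is a
# document that tokenizes to ~997 tokens, i.e. well beyond the 514 limit.
ENDPOINT_NAME = "my-text-classification-serverless-endpoint"
long_text = "some long text " * 300

runtime = boto3.client("sagemaker-runtime")

response = runtime.invoke_endpoint(
    EndpointName=ENDPOINT_NAME,
    ContentType="application/json",
    Body=json.dumps({"inputs": long_text}),
)
print(json.loads(response["Body"].read().decode("utf-8")))
```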
To make sure the model can handle inputs of any length via truncation, I updated the model's tokenizer_config.json with an additional argument "model_max_length": 514, but unfortunately the error remains the same.
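Concretely, the only change I made is this entry in tokenizer_config.json (all other keys are left exactly as they shipped with the fine-tuned model):

```json
{
  "model_max_length": 514
}
```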
Am I working on the wrong part of the model? Do I have to set it in tokenizer.json instead?
Looking forward to your expertise!
Regards,
David