SageMaker Model | How to set Truncation within Config?

dbb · September 24, 2023, 2:16pm

Hi!

I created a SageMaker serverless endpoint that serves a fine-tuned text classification model. When I invoke it with an input longer than the maximum input length (514 tokens), it returns the following error, as expected:

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from model with message "{
  "code": 400,
  "type": "InternalServerException",
  "message": "The expanded size of the tensor (997) must match the existing size (514) at non-singleton dimension 1.  Target sizes: [1, 997].  Tensor sizes: [1, 514]"
}

To make sure that the model can handle any input length through truncation, I updated the model's tokenizer_config.json with an additional argument, "model_max_length": 514, but unfortunately the error remains the same.

Am I working on the wrong part of the model? Do I have to set it in tokenizer.json?

Looking forward to your expertise!

Regards,
David

philschmid · September 25, 2023, 7:45am

Are you passing truncation as a parameter? The pipeline does not truncate inputs by default.

dbb · September 25, 2023, 7:59am

Hi @philschmid! No, currently I don’t.
I know that this solution exists, but I’m wondering whether there is a way to configure the model itself to apply truncation by default.
The idea is to make the model available in the simplest way possible, so my users should not have to worry about adding “NLP-specific” parameters to their requests.

I’ll give truncation a try, just to see if that would be a workaround. Could I also pass max_length in the request?

dbb · September 25, 2023, 8:43am

I tried it with truncation in the request:

data = {
  "parameters": {"truncation": True},
  "inputs": "Text longer than 514 Tokens"
}

res = predictor.predict(data=data)
print(res)

and it works as expected.

Is there a way to make this the default behavior through the tokenizer config files?

I’d like to make my model available via an AWS API Gateway, and to keep things separated, my users should not have to worry about NLP-specific topics like truncation.
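
One possible way to get truncation by default without touching the tokenizer config files is to override the inference handlers on the endpoint. The sketch below assumes the endpoint runs the Hugging Face inference toolkit for SageMaker and that a custom script at code/inference.py inside model.tar.gz is picked up for its model_fn and predict_fn hooks. The helper with_default_parameters and the DEFAULT_PARAMETERS dict are hypothetical names used only for illustration; this is not confirmed in the thread as the recommended approach.

```python
# code/inference.py (assumed location inside model.tar.gz)

# Parameters applied to every request unless the caller overrides them.
# Hypothetical name, chosen for this sketch.
DEFAULT_PARAMETERS = {"truncation": True}


def with_default_parameters(data):
    """Split a request payload into (inputs, parameters), enabling truncation by default."""
    inputs = data.get("inputs", data)
    # Caller-supplied parameters win over the defaults.
    parameters = {**DEFAULT_PARAMETERS, **(data.get("parameters") or {})}
    return inputs, parameters


def model_fn(model_dir):
    """Load the fine-tuned model as a text-classification pipeline."""
    # Imported lazily so the helper above can be exercised without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_dir, tokenizer=model_dir)


def predict_fn(data, model):
    """Run inference with truncation enabled unless the request says otherwise."""
    inputs, parameters = with_default_parameters(data)
    return model(inputs, **parameters)
```

With a handler like this, a request body of just {"inputs": "..."} would be truncated to the model's maximum length, while callers who send their own "parameters" can still override the default.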
