I fine-tuned the Llama 2 7B model on SageMaker, but I get a token-limit error during inference.
Here is the error I get most of the time:
An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (422) from primary with message "{"error":"Input validation error: `max_new_tokens` must be <= 1. Given: 500","error_type":"validation"}"
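For context, the invocation looks roughly like this (a minimal sketch; the endpoint name and prompt are placeholders, and the payload follows the TGI request schema):

import json
import boto3

runtime = boto3.client("sagemaker-runtime")
response = runtime.invoke_endpoint(
    EndpointName="llama2-7b-finetuned",          # hypothetical endpoint name
    ContentType="application/json",
    Body=json.dumps({
        "inputs": "<long prompt here>",
        "parameters": {"max_new_tokens": 500},   # the value rejected in the error
    }),
)
print(json.loads(response["Body"].read()))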
Does anyone have any idea why this happens and how I can solve it?
@philschmid , maybe you can help?
You can configure this via the environment variables:
'MAX_INPUT_LENGTH': json.dumps(1024),  # max length of input text
'MAX_TOTAL_TOKENS': json.dumps(2048),  # max length of the generation (including input text)
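These go into the env argument when creating the model, before deploying. A sketch of the full flow (the S3 path, TGI image version, and instance type are placeholders to adapt):

import json
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

config = {
    'HF_MODEL_ID': '/opt/ml/model',        # serve the fine-tuned weights extracted from model_data
    'SM_NUM_GPUS': json.dumps(1),          # GPUs per replica
    'MAX_INPUT_LENGTH': json.dumps(1024),  # max length of input text
    'MAX_TOTAL_TOKENS': json.dumps(2048),  # max length of the generation (including input text)
}

llm_model = HuggingFaceModel(
    model_data="s3://my-bucket/llama2-7b-finetuned/model.tar.gz",  # placeholder path
    role=sagemaker.get_execution_role(),
    image_uri=get_huggingface_llm_image_uri("huggingface", version="1.0.3"),  # placeholder version
    env=config,
)
llm = llm_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",  # placeholder instance type
)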
Sorry for the late reply. This worked, thanks.
For a longer input with system prompts, can we increase the input length or the token length?
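For example, would something like this work, keeping the total within Llama 2's 4096-token context window?

config = {
    'MAX_INPUT_LENGTH': json.dumps(3000),  # leave room for long system prompts
    'MAX_TOTAL_TOKENS': json.dumps(4096),  # input + generated tokens combined
}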
Where exactly should these be configured? I ran into the same error while trying to invoke a deployed SageMaker endpoint. I passed these along when deploying the fine-tuned estimator to the endpoint, but I'm still getting the same error.