ValidationError: Max token limit (>=1) reached for fine-tuned models

I fine-tuned the Llama2-7b model on SageMaker, but I get a token-limit error during inference.
Here is the error I get most of the time.


An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (422) from primary with message "{"error":"Input validation error: `max_new_tokens` must be <= 1. Given: 500","error_type":"validation"}"
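For reference, here is roughly how I am invoking the endpoint (a minimal sketch; the endpoint name, prompt, and parameter values below are placeholders):

import json
import boto3

runtime = boto3.client("sagemaker-runtime")

# Standard TGI-style payload; max_new_tokens=500 is what triggers the error above
payload = {
    "inputs": "What is the capital of France?",
    "parameters": {"max_new_tokens": 500},
}

response = runtime.invoke_endpoint(
    EndpointName="my-llama2-endpoint",  # placeholder endpoint name
    ContentType="application/json",
    Body=json.dumps(payload),
)
print(json.loads(response["Body"].read()))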

Does anyone have any idea why this is and how I can solve this?
@philschmid, maybe you can help?

You can configure this via the environment variables:

'MAX_INPUT_LENGTH': json.dumps(1024),  # Max length of input text
'MAX_TOTAL_TOKENS': json.dumps(2048),  # Max length of the generation (including input text)
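For example, when deploying with the Hugging Face LLM (TGI) container, these go into the env dict of the model object (a rough sketch; the model_data path, container version, and instance type below are placeholders you need to adapt):

import json
import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()

# Hugging Face LLM (TGI) inference container
image_uri = get_huggingface_llm_image_uri("huggingface", version="1.0.3")  # placeholder version

model = HuggingFaceModel(
    model_data="s3://my-bucket/llama2-7b-finetuned/model.tar.gz",  # placeholder S3 path to your fine-tuned weights
    role=role,
    image_uri=image_uri,
    env={
        "MAX_INPUT_LENGTH": json.dumps(1024),  # Max length of input text
        "MAX_TOTAL_TOKENS": json.dumps(2048),  # Max length of the generation (including input text)
    },
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.2xlarge",  # placeholder instance type
)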

Sorry for the late reply. This worked, thanks.
For longer inputs with system prompts, can we increase the input length or the total token length?

Where exactly should these be configured? I ran into the same error while trying to invoke a deployed SageMaker endpoint. I passed these along when deploying the fine-tuned estimator to the endpoint, but I am still getting the same error.