How to increase max_new_tokens beyond 1200 in Code Llama

I am using Code Llama on an Inference Endpoint, but I am unable to set max_new_tokens beyond 1100. These are my parameters:

"parameters": {
            "max_new_tokens": 1024, # adjust this value to generate more tokens
            "return_full_text": False,
            }

It throws this error:

{'error': 'Input validation error: inputs tokens + max_new_tokens must be <= 1512. Given: 345 inputs tokens and 1324 max_new_tokens', 'error_type': 'validation'}

Is there any way around this?
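For reference, the full request looks roughly like this; the endpoint URL, token, and prompt are placeholders:

    import requests

    API_URL = "https://YOUR_ENDPOINT.endpoints.huggingface.cloud"  # placeholder
    HEADERS = {"Authorization": "Bearer hf_xxx"}  # placeholder token

    payload = {
        "inputs": "def fibonacci(n):",  # placeholder prompt (345 tokens in my case)
        "parameters": {
            "max_new_tokens": 1324,
            "return_full_text": False,
        },
    }

    # Fails with the validation error above whenever
    # (input tokens) + max_new_tokens > 1512.
    response = requests.post(API_URL, headers=HEADERS, json=payload)
    print(response.json())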

I thought I would have to tinker with the Endpoint side, but it looks like there is a way to do it.

Appreciate your response, but it is in fact a container config on the Endpoint side. I updated it and am now able to generate responses of up to 10,000 tokens.
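For anyone who lands here later: the 1512 cap comes from the text-generation-inference (TGI) container's MAX_TOTAL_TOKENS setting, which you can raise in the endpoint's container environment variables. A minimal sketch with huggingface_hub, assuming the custom_image/env fields of InferenceEndpoint.update; the endpoint name, image tag, and limits below are illustrative, and the same values can also be set in the endpoint's web UI:

    from huggingface_hub import get_inference_endpoint

    endpoint = get_inference_endpoint("my-codellama-endpoint")  # hypothetical name

    # MAX_INPUT_LENGTH and MAX_TOTAL_TOKENS are the TGI settings behind the
    # "inputs tokens + max_new_tokens must be <= 1512" validation error.
    endpoint.update(
        custom_image={
            "health_route": "/health",
            "url": "ghcr.io/huggingface/text-generation-inference:1.1.0",  # example tag
            "env": {
                "MODEL_ID": "/repository",
                "MAX_INPUT_LENGTH": "2048",       # illustrative value
                "MAX_TOTAL_TOKENS": "12288",      # illustrative; leaves ~10k for new tokens
            },
        }
    )
    endpoint.wait()  # block until the container has redeployed with the new limits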
