The current text generation call will exceed the model's predefined maximum length

I’m currently working on a generation task with the Hugging Face transformers library and have run into an issue. I’m using the NSQL-2B model, and the generation configuration is set as follows:

GenerationConfig(num_beams=5, early_stopping=True, pad_token_id=self.__model.config.pad_token_id, max_new_tokens=2048, do_sample=False, temperature=0.1)

I get the following warning when generating text:

This is a friendly reminder - the current text generation call will exceed the model's predefined maximum length (2048). Depending on the model, you may observe exceptions, performance degradation, or nothing at all.

followed by this error:

index 2048 is out of bounds for dimension 0 with size 2048

The model's maximum length is 2048.

I’m looking for advice on how to handle this. Should I reduce the max_new_tokens parameter, or is there another way to work within this limit without compromising generation quality? Any insights or experiences with similar issues would be greatly appreciated.
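One thing I’ve been considering (shown here as a rough sketch, not a confirmed fix): since the prompt tokens and the newly generated tokens share the same 2048-token context window, the request could be clamped so that prompt length plus max_new_tokens never exceeds the model maximum. The helper name and the example numbers below are my own, for illustration:

```python
def cap_max_new_tokens(prompt_len: int, model_max_len: int, requested: int) -> int:
    """Clamp `max_new_tokens` so that prompt tokens plus newly
    generated tokens never exceed the model's context window."""
    available = max(model_max_len - prompt_len, 0)  # room left after the prompt
    return min(requested, available)

# Example: a 500-token prompt against a 2048-token window
# leaves room for at most 1548 new tokens.
print(cap_max_new_tokens(500, 2048, 2048))   # 1548
```

In practice the prompt length would come from the tokenizer (e.g. the length of the encoded input IDs), and the capped value would be passed as max_new_tokens. Separately, I understand temperature is ignored when do_sample=False, so that setting may be doing nothing here.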

Thank you!