Weird error when trying to generate a response from a fine-tuned model

I’ll start by saying sorry for the lack of good context: I’m a beginner and rather lost on most aspects of LLM fine-tuning, but ask away and I’ll provide as much info as necessary to answer my question.

I’ve been getting the following error when trying to use the `generate` method on the model I fine-tuned on my own data. The whole code can be found in the Google Colab workspace linked below.

Here’s the link to my Google Colab workspace: GCLB Notebook

And finally, here’s the model I trained: Hugging Face - sabia-essay-correction

bitsandbytes version: 0.43.0
torch version: 2.2.1+cu121
transformers version: 4.38.2
peft version: 0.10.0
accelerate version: 0.28.0
datasets version: 2.18.0
einops version: 0.7.0
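In case it’s useful for reproducing the list above, this is roughly how the versions can be collected in one go (a minimal sketch using only the standard library; the package names are the ones listed above):

```python
# Sketch: print "<name> version: <ver>" for each installed package,
# using only the standard library (Python >= 3.8).
from importlib.metadata import version, PackageNotFoundError

def report_versions(packages):
    """Return '<name> version: <ver>' lines for each package in order."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name} version: {version(name)}")
        except PackageNotFoundError:
            # Package is not installed in this environment
            lines.append(f"{name} version: not installed")
    return lines

for line in report_versions(
    ["bitsandbytes", "torch", "transformers", "peft", "accelerate", "datasets", "einops"]
):
    print(line)
```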

Also, here’s the original model mine is based on: Hugging Face - sabia-7b
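For context, the load-and-generate pattern in question looks roughly like this. This is only a simplified sketch, not the exact notebook code: the base repo id `maritaca-ai/sabia-7b` and the quantization settings are assumptions, and `sabia-essay-correction` is a placeholder for the full adapter repo id linked above.

```python
# Sketch of loading a PEFT adapter on top of a 4-bit quantized base model
# and generating text. Repo ids and quantization settings are assumptions;
# the exact code is in the linked notebook.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "maritaca-ai/sabia-7b"       # assumed base model repo id
adapter_id = "sabia-essay-correction"  # placeholder for the adapter repo id

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
# Attach the fine-tuned LoRA adapter to the quantized base model
model = PeftModel.from_pretrained(model, adapter_id)

inputs = tokenizer("Corrija esta redação:", return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```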