I recommend you use llama-instruct model
I agree. Also, it seems you are using transformers, but that software is more of a library suited to advanced users, for detailed specialized tasks, as well as customizing models and training.
If your main use is for chatting, I recommend using Ollama, which is simpler, faster, and based on the Instruct model.