Slow generation with large context

Hello everyone. I am using the IBM Granite 20B model for a code generation task, and it works quite well. However, when I make my prompt longer by adding more in-context examples, generation becomes very slow. Can anyone suggest how to speed it up with longer prompts? I have already applied quantization and similar optimizations.