Using LLM cache

Hi,

I’m working with an LLM that generates text in stages: it stops after each section, performs an external action, integrates the action’s result into the generated text, and then continues. This approach works, but the time it takes has become a problem.
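Here’s a rough sketch of my loop, assuming a Hugging Face transformers setup; the model name and the `perform_action()` helper are placeholders for my actual pipeline:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.2-1B-Instruct"  # placeholder for my model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

def perform_action(section: str) -> str:
    # Placeholder for the external action run between sections.
    return "<result of the action>"

prompt = "Write a report, one section at a time.\n"
num_stages = 3

for stage in range(num_stages):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=256)
    # Keep only the newly generated section.
    section = tokenizer.decode(
        output_ids[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    result = perform_action(section)
    # The full text (prompt + section + action result) is re-fed at the next
    # stage, so the whole prefix gets re-processed from scratch every time.
    prompt = prompt + section + "\n" + result + "\n"
```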

I’ve tried using a static cache, but it actually increased processing time.
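For context, this is roughly how I enabled it (again assuming the transformers `generate()` API; `cache_implementation` is the flag I passed):

```python
# Same loop as above, but with a static KV cache enabled for generation.
output_ids = model.generate(
    **inputs,
    max_new_tokens=256,
    cache_implementation="static",
)
```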

Is there a way to speed this up, for example by reusing the KV cache across stages instead of re-processing the full prefix each time?