Blenderbot 1.0B Distilled eats up memory over many inferences

Hi, I’ve noticed that over the course of many inferences, the Blenderbot 1.0B Distilled model keeps allocating more GPU memory until the GPU eventually crashes. My project only uses single-turn inferences, so nothing should need to persist between calls. How can I prevent Blenderbot from continuously allocating memory? Thanks!
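
A minimal sketch of a single-turn loop that might help narrow this down. It assumes the Hugging Face `transformers` implementation and the `facebook/blenderbot-1B-distill` checkpoint, neither of which is confirmed above; `single_turn_reply`, `allocated_mib`, and `demo` are hypothetical helper names. The idea is to generate with gradient tracking disabled, keep only the decoded string, and log allocated memory to see whether it still grows.

```python
import torch


def single_turn_reply(model, tokenizer, text, device):
    """Generate one reply without autograd state or stored chat history."""
    inputs = tokenizer([text], return_tensors="pt").to(device)
    # inference_mode disables gradient tracking, so no autograd graph
    # is kept alive across calls.
    with torch.inference_mode():
        output_ids = model.generate(**inputs)
    # Return a plain string so no GPU tensor outlives this call.
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]


def allocated_mib():
    """Currently allocated CUDA memory in MiB (0.0 when no GPU is present)."""
    if not torch.cuda.is_available():
        return 0.0
    return torch.cuda.memory_allocated() / 2**20


def demo(n_turns=100):
    """Run repeated single-turn inferences and print memory every 10 turns."""
    # Imported here so the helpers above can be used without transformers.
    from transformers import (
        BlenderbotForConditionalGeneration,
        BlenderbotTokenizer,
    )

    name = "facebook/blenderbot-1B-distill"  # assumed checkpoint name
    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = BlenderbotTokenizer.from_pretrained(name)
    model = BlenderbotForConditionalGeneration.from_pretrained(name)
    model = model.to(device).eval()

    for i in range(n_turns):
        single_turn_reply(model, tokenizer, "Hello, how are you?", device)
        if i % 10 == 0:
            print(i, f"{allocated_mib():.1f} MiB allocated")
```

Calling `demo()` with a fixed input should show whether allocated memory stays flat. If it does, the growth is more likely coming from something outside the generate call, such as tensors or history being accumulated elsewhere in the project.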