T5-Small Greedy Search out of memory

Hello,

I am trying to run greedy search with t5-small, following the example listed here. This simple operation results in a CUDA out-of-memory error on a V100 (32 GB VRAM).
Using the generate function with the same model and the same inputs works fine, however.

Any idea what could be causing this issue? Help is much appreciated!