Is There a Way to Improve Memory Usage When Using Identical `past_key_values` for All Samples in a Batch?

This page in the docs should help.
