Ways to reduce memory consumption in Q&A tasks without hurting accuracy (or at least, not by much)?

I’m facing this problem: I’m trying to use less memory in my Q&A task with BERT. I debugged my steps and saw that the call that computes start_logits and end_logits

start_logits, end_logits = model(**inputs)

costs more than 11 GB of RAM. Is there any way to solve this? I mean, to use less memory for this task without harming my model's accuracy? If so, can someone share some approaches? And what are the alternatives in case this is not possible?
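
For context on what I mean by "ways": one idea I have seen suggested is to feed the model a few examples at a time instead of everything at once, so that peak memory depends on the batch size rather than on the whole dataset. Here is a minimal sketch of that idea, assuming the model is only used for prediction (not training). The names iter_batches, predict_in_batches, and predict_fn are hypothetical; predict_fn stands in for whatever wraps the model(**inputs) call.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def iter_batches(items: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def predict_in_batches(
    predict_fn: Callable[[Sequence[T]], List[R]],
    items: Sequence[T],
    batch_size: int,
) -> List[R]:
    """Call `predict_fn` on small batches and concatenate the per-item results.

    `predict_fn` is a hypothetical wrapper around the model call. With PyTorch,
    the body of that wrapper would typically run inside `with torch.no_grad():`
    so that activations are not kept around for a backward pass.
    """
    results: List[R] = []
    for batch in iter_batches(items, batch_size):
        results.extend(predict_fn(batch))
    return results
```

Since each example is still processed by the same model with the same weights, I would expect the predictions to be unchanged, but I have not verified that this alone is enough to bring the memory down in my case.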