I am trying to use encoder-decoder cross-attention matrices in a project I am working on.
I only need the attention matrix for the last token that is generated. Setting output_scores=True returns it for all tokens, but I only need the last one. Is there a way to access it separately, to avoid consuming so much memory?
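
To make clear what I mean by "only the last token", here is a minimal sketch in plain Python, assuming standard scaled dot-product attention with a single head. The function name and the toy vectors are hypothetical and not taken from any library; the point is only that the last token's row needs just the last decoder query and the encoder keys, not the full matrix.

```python
import math

def last_token_cross_attention(decoder_queries, encoder_keys):
    """Hypothetical helper: attention weights for the LAST decoder position only.

    decoder_queries: list of query vectors, one per generated token
    encoder_keys:    list of key vectors, one per encoder position
    Returns one row of weights over the encoder positions.
    """
    q = decoder_queries[-1]          # only the last generated token's query
    d = len(q)
    # Scaled dot product of the last query with every encoder key
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
              for k in encoder_keys]
    # Numerically stable softmax over encoder positions
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy data (assumed): 3 decoder positions, 4 encoder positions, dimension 2
decoder_queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
encoder_keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

row = last_token_cross_attention(decoder_queries, encoder_keys)
print(len(row))  # one weight per encoder position
```

This is the single row I would like to get back, instead of one matrix per generated token.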