Visualizing attention heatmaps of LayoutLMv3

I’m looking to interpret the attention outputs of the LayoutLMv3ForTokenClassification model on the SROIE dataset (darentang/sroie) for the key information extraction task. I would like to visualize the attention weights as heatmaps over both the input text tokens and the input image. Is it possible to do this with the attentions attribute of the model output?
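To make the question concrete, here is a rough sketch of the post-processing I have in mind. It is not a confirmed approach. It assumes the model is called with `output_attentions=True`, so that `outputs.attentions` holds one tensor per layer of shape (batch, heads, seq_len, seq_len). It also assumes the sequence is laid out as the text tokens, then one visual CLS token, then a 14x14 grid of image patches (224x224 input, patch size 16). The helper name `split_attention` is hypothetical, and the token layout is my guess that I would like someone to confirm.

```python
import numpy as np


def split_attention(attn, num_text_tokens, grid_size=14):
    """Split one layer's attention into text-to-text and text-to-patch maps.

    attn: array of shape (num_heads, seq_len, seq_len) for a single example.
    ASSUMPTION: seq_len = num_text_tokens + 1 (visual CLS) + grid_size**2.
    """
    attn = np.asarray(attn, dtype=float)
    mean_attn = attn.mean(axis=0)  # average over heads -> (seq_len, seq_len)

    patch_start = num_text_tokens + 1  # skip the assumed visual CLS token
    expected = patch_start + grid_size * grid_size
    if mean_attn.shape != (expected, expected):
        raise ValueError(
            f"expected seq_len {expected}, got shape {mean_attn.shape}"
        )

    # Attention from each text token to every other text token.
    text_to_text = mean_attn[:num_text_tokens, :num_text_tokens]
    # Attention from each text token to each image patch, as a 2D grid.
    text_to_patch = mean_attn[:num_text_tokens, patch_start:].reshape(
        num_text_tokens, grid_size, grid_size
    )
    return text_to_text, text_to_patch


# Intended use with the model (assumed, not run here):
#   outputs = model(**encoding, output_attentions=True)
#   attn = outputs.attentions[-1][0].detach().cpu().numpy()  # last layer
#   num_text = encoding["input_ids"].shape[1]
#   t2t, t2p = split_attention(attn, num_text)
```

Each `text_to_patch[i]` grid could then be upsampled to the image size and overlaid on the receipt, for example with matplotlib's `imshow`, while `text_to_text` could be drawn as a token-by-token heatmap. Whether raw attention weights are a meaningful explanation here is part of what I am unsure about.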

Any help is appreciated! Thank you.