I have a question about `output_attentions`: I need to make a heatmap of the attention weights from the final layer of a BERT model, but I cannot tell whether `output_attentions[0]` corresponds to the first or the last layer. I tried to check the documentation, but I did not find an answer.
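For reference, `outputs.attentions` in `transformers` is a tuple with one tensor per layer, ordered from the first (bottom) layer to the last (top) layer, so `attentions[-1]` is the final layer. Below is a minimal sketch that checks the tuple's length and shapes; it builds a small random-weight `BertModel` from a `BertConfig` (an assumption made here just to avoid downloading a checkpoint; with a pretrained model such as `bert-base-uncased` the indexing is identical):

```python
import torch
from transformers import BertConfig, BertModel

# Small random-weight BERT, built from a config so nothing is downloaded.
config = BertConfig(
    hidden_size=64,
    num_hidden_layers=4,
    num_attention_heads=4,
    intermediate_size=128,
)
model = BertModel(config)
model.eval()

# Dummy batch: 1 sequence of 10 token ids.
input_ids = torch.randint(0, config.vocab_size, (1, 10))

with torch.no_grad():
    outputs = model(input_ids, output_attentions=True)

# One attention tensor per layer, ordered first -> last.
first_layer_attn = outputs.attentions[0]   # first encoder layer
last_layer_attn = outputs.attentions[-1]   # final layer: use this for the heatmap

print(len(outputs.attentions))   # equals config.num_hidden_layers
print(last_layer_attn.shape)     # (batch, num_heads, seq_len, seq_len)
```

For the heatmap itself, each entry of the tuple has shape `(batch, num_heads, seq_len, seq_len)`, so you would typically pick one head (e.g. `last_layer_attn[0, 0]`) or average over heads (`last_layer_attn[0].mean(dim=0)`) before plotting.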