I have a question about `output_attentions`: I need to make a heatmap of the attention from the final layer of a BERT model, but I do not know whether `output_attentions[0]` is the first or the last layer. I tried to check the documentation, but I could not find an answer.
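For context, here is a minimal sketch of the plotting step I have in mind, with a random NumPy array standing in for one element of `outputs.attentions` (the shapes here are placeholders I chose for illustration, and indexing with `[-1]` assumes the tuple is ordered first layer to last, which is exactly what I'd like to confirm):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

# Dummy stand-in for outputs.attentions: a tuple with one tensor per
# encoder layer, each shaped (batch, num_heads, seq_len, seq_len).
num_layers, num_heads, seq_len = 12, 12, 8
rng = np.random.default_rng(0)
attentions = tuple(
    rng.random((1, num_heads, seq_len, seq_len)) for _ in range(num_layers)
)

# If the tuple runs first layer -> last layer, [-1] is the final layer.
final_layer = attentions[-1][0]           # (num_heads, seq_len, seq_len)
avg_attention = final_layer.mean(axis=0)  # average over heads -> (seq_len, seq_len)

plt.imshow(avg_attention, cmap="viridis")
plt.colorbar()
plt.savefig("attention_heatmap.png")
print(len(attentions), avg_attention.shape)
```

If someone can confirm the ordering, I can swap the dummy tuple for the real `outputs.attentions` from the model call.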