In my current project, I am training encoder-decoder models (BART, T5, etc.), and the Transformers library has been absolutely invaluable! After seeing several BERTology analyses (i.e., studies of what the model's attention mechanism learns to attend to), I would like to know whether a similar analysis is possible with the BART and T5 models in the Hugging Face library. Any recommendations are certainly appreciated!
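For reference, here is a minimal sketch of the kind of introspection I have in mind, using the `output_attentions=True` flag that Transformers models accept. The `facebook/bart-base` checkpoint and the input sentence are just placeholders; I am not sure whether this is the recommended way to do attention analysis for seq2seq models:

```python
import torch
from transformers import AutoTokenizer, BartModel

# Placeholder checkpoint; any BART (or T5) checkpoint should work the same way.
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
model = BartModel.from_pretrained("facebook/bart-base")

inputs = tokenizer("The tower is 324 metres tall.", return_tensors="pt")
with torch.no_grad():
    # output_attentions=True makes the forward pass return attention
    # weights alongside the hidden states.
    outputs = model(**inputs, output_attentions=True)

# Encoder-decoder models expose three attention families:
#   encoder_attentions: self-attention in each encoder layer
#   decoder_attentions: self-attention in each decoder layer
#   cross_attentions:   decoder-over-encoder attention in each decoder layer
# Each is a tuple with one tensor per layer, shaped
# (batch_size, num_heads, query_len, key_len).
print(len(outputs.encoder_attentions), outputs.encoder_attentions[0].shape)
print(len(outputs.cross_attentions), outputs.cross_attentions[0].shape)
```

These tensors seem like they could feed the same head-by-head visualizations used in the BERTology papers, but I would appreciate pointers to anything more principled.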