Our team is using BERT/Roberta from the huggingface transformers library for sequence-classification (amongst other tasks). We are looking for an efficient way to monitor the attention matrices so as to understand what the model is doing during inference (i.e. the model made this prediction because it is focusing on these words, etc). Are there any useful code snippets used for analysis.
Often the models make funny predictions, and it’s hard to understand why… How are other teams managing this process? We want to avoid large bloated (graphical) tools, and would prefer simplicity.
thanks!