Getting explanations for BERT classifications

So I’ve been using BERT models for text classification and it’s all going great, but I was just wondering: is there any way to have a BERT model provide some sort of explanation for the classifications it makes?

For example, say a text has been classified as A rather than B. Can we then get back a heatmap showing which parts of the text, which relations within it, or which individual words contributed to it being more likely an A than a B?
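
To make it concrete, here’s the sort of thing I’m imagining. This is just a sketch using SHAP’s text explainer on a Hugging Face pipeline, assuming `shap` and `transformers` are installed; the model name is a stand-in for whatever fine-tuned BERT you’re actually using, and I don’t know yet whether this is the recommended approach:

```python
import shap
from transformers import pipeline

# Any fine-tuned BERT-style classifier works here; this model name is a placeholder.
clf = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
    return_all_scores=True,  # SHAP wants a score for every class, not just the top one
)

explainer = shap.Explainer(clf)
shap_values = explainer(["The plot was thin but the acting carried it."])

# In a notebook this renders the text with each token coloured by how much it
# pushed the prediction towards (red) or away from (blue) each class label.
shap.plots.text(shap_values[0])
```

Captum’s integrated-gradients attributions would presumably be the other obvious route to a token-level heatmap, but SHAP looked like the shortest path.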

And separately from any individual classification, is there some way to visualize the overall kinds of things the BERT model is focused on for a given dataset (assuming we’ve fine-tuned it on that dataset), or some kind of summary of the key features it relies on across the larger dataset it was originally trained on?
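
For this dataset-level question, the closest thing I can picture is embedding every example with the (fine-tuned) model and projecting the per-example [CLS] vectors down to 2-D, coloured by label, to see what the model has grouped together. A rough sketch, where the checkpoint, texts, and labels are all placeholders:

```python
import torch
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE
from transformers import AutoModel, AutoTokenizer

# Placeholder checkpoint; in practice this would be the fine-tuned model.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

texts = ["an example from class A", "an example from class B"]  # your dataset
labels = [0, 1]  # the corresponding class labels

with torch.no_grad():
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    out = model(**enc)
    cls_vectors = out.last_hidden_state[:, 0, :]  # one [CLS] embedding per example

# Perplexity must be smaller than the number of samples; tune it on a real dataset.
coords = TSNE(n_components=2, perplexity=min(30, len(texts) - 1)).fit_transform(
    cls_vectors.numpy()
)

plt.scatter(coords[:, 0], coords[:, 1], c=labels)
plt.title("BERT [CLS] embeddings, t-SNE projection")
plt.show()
```

UMAP would presumably work in place of t-SNE, and mean-pooling the token embeddings instead of taking [CLS] is another common choice; I don’t know which is better here.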

Many thanks in advance

I have found this blog post, “Visualize BERT sequence embeddings: An unseen way” by Tanmay Garg on Towards Data Science, and this technical paper, “Visualizing and Measuring the Geometry of BERT” on DeepAI. I’ll need to spend some time digesting them to see if they’re what I’m thinking of.