I am currently working on sentiment analysis in PyTorch, using either BERT’s pooled output or a classification model built on BERT. I would like to see how much each word in a sentence contributes to the predicted sentiment.
More formally, I want a tensor of size batch_size x max_length, where entry (i, j) is a weight indicating the importance of the j-th word in the i-th sentence of the batch.
I’ve seen many people suggest looking at the 11th layer of BERT, but I haven’t been able to find out how to get anything like that from the model. Any help with this would be appreciated.
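
To make the target concrete: one possible reading of the "11th layer" suggestion is to take the attention weights of the last encoder layer (index 11 in a 12-layer BERT-base), keep the row for the [CLS] position, and average over heads. Below is a minimal sketch of that reduction, not a definitive method. The function name `cls_attention_importance` is hypothetical, and the attention tensors here are random stand-ins so the snippet runs on its own. With Hugging Face `transformers`, I assume the real ones would come from `model(**inputs, output_attentions=True).attentions`.

```python
import torch


def cls_attention_importance(attentions, attention_mask, layer=-1):
    """Reduce per-layer attention to a (batch_size, max_length) weight tensor.

    attentions: tuple of tensors, one per layer, each of shape
        (batch_size, num_heads, max_length, max_length).
    attention_mask: (batch_size, max_length), 1 for real tokens, 0 for padding.
    layer: which layer to read; 11 is the last layer of a 12-layer model.
    """
    layer_attn = attentions[layer]            # (batch, heads, seq, seq)
    cls_to_tokens = layer_attn[:, :, 0, :]    # attention from position 0 ([CLS])
    weights = cls_to_tokens.mean(dim=1)       # average over heads -> (batch, seq)
    # Zero out padding positions and renormalise each row to sum to 1.
    weights = weights * attention_mask.to(weights.dtype)
    return weights / weights.sum(dim=-1, keepdim=True).clamp_min(1e-12)


# Random stand-in for real model attentions (assumption: 12 layers, 4 heads).
torch.manual_seed(0)
batch_size, num_heads, max_length, num_layers = 2, 4, 6, 12
attention_mask = torch.tensor([[1, 1, 1, 1, 1, 1],
                               [1, 1, 1, 1, 0, 0]])
attentions = tuple(
    torch.softmax(torch.randn(batch_size, num_heads, max_length, max_length), dim=-1)
    for _ in range(num_layers)
)

importance = cls_attention_importance(attentions, attention_mask, layer=11)
print(importance.shape)
```

Two caveats apply to this sketch: the weights are per WordPiece token rather than per word, so subword pieces would still need to be merged, and attention weights are not guaranteed to reflect what actually drives the prediction.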