Is it possible to see all the token rankings for masked language modelling?

anon58275033 · June 30, 2021, 11:02am

Hi,

I was just wondering whether it would be possible to see all the predicted tokens for masked language modelling? Specifically, I am interested in the tokens with a low probability.

For example, consider this masked language model:

unmasker("I am feeling <mask> today")

[{'score': 0.5322356820106506,
  'sequence': 'I am feeling good today',
  'token': 4,
  'token_str': 'good'},
 {'score': 0.1725485771894455,
  'sequence': 'I am feeling happy today!',
  'token': 328,
  'token_str': 'happy'},
 {'score': 0.1252109706401825,
  'sequence': 'I am feeling sad today."',
  'token': 72,
  'token_str': 'sad"'},
 {'score': 0.01904081553220749,
  'sequence': 'I am feeling angry today!"',
  'token': 2901,
  'token_str': 'angry'},
 {'score': 0.012199202552437782,
  'sequence': 'I am feeling fun today…',
  'token': 1174,
  'token_str': 'fun'}]

As you can see from my output, the top tokens are “good”, “happy”, “sad”, “angry” and “fun”. However, would it be possible to see all the predicted tokens beyond the top 5?

I just want to see a list of all the predicted tokens, including the ones which have the lowest probability, if this is possible.

I don’t want to see the top 5 predicted; I want to see all of them.

Thanks.

Felipehonorato · June 17, 2022, 12:48pm

I guess the only way to do that is to work with the model directly, outside the pipeline method. That way you get the raw logits at the masked position, and you can apply a softmax to them to get the probability of any token you want, not just the top 5.
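For what it's worth, here is a minimal sketch of that idea, not a definitive implementation. `rank_all_tokens` is a hypothetical helper that turns the logits for the masked position into a probability for every vocabulary entry. `mask_logits_from_model` is also hypothetical: it assumes `torch` and `transformers` are installed, and it assumes `roberta-base` as the checkpoint, since the thread does not say which model was used (only that the mask token is `<mask>`). The toy logits at the bottom are made-up numbers for illustration, not real model output.

```python
import math


def rank_all_tokens(mask_logits, id_to_token):
    """Return (token, probability) pairs for the whole vocabulary, most likely first."""
    # Numerically stable softmax over every vocabulary entry.
    peak = max(mask_logits)
    exps = [math.exp(x - peak) for x in mask_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return sorted(zip(id_to_token, probs), key=lambda pair: pair[1], reverse=True)


def mask_logits_from_model(text, model_name="roberta-base"):
    """Hypothetical helper: fetch the logits at the first mask position.

    The checkpoint name is an assumption; substitute your own model.
    """
    # Imported lazily so rank_all_tokens can be used without these packages.
    import torch
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForMaskedLM.from_pretrained(model_name)

    inputs = tokenizer(text, return_tensors="pt")
    # Index of the first mask token in the encoded input.
    is_mask = inputs["input_ids"][0] == tokenizer.mask_token_id
    mask_index = is_mask.nonzero(as_tuple=True)[0][0]

    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (batch, seq_len, vocab_size)

    mask_logits = logits[0, mask_index].tolist()
    # Assumes the logit dimension lines up with the tokenizer's ids.
    id_to_token = tokenizer.convert_ids_to_tokens(list(range(len(mask_logits))))
    return mask_logits, id_to_token


# Toy stand-in for real model output (made-up logits, five-word vocabulary).
toy_vocab = ["good", "happy", "sad", "angry", "fun"]
toy_logits = [2.0, 1.0, 0.5, -1.0, -3.0]

ranking = rank_all_tokens(toy_logits, toy_vocab)
lowest_first = ranking[::-1]  # least likely tokens first
```

With a real model you would call `rank_all_tokens(*mask_logits_from_model("I am feeling <mask> today"))` and then reverse or slice the result to look at the low-probability end. Depending on your installed version of `transformers`, the fill-mask pipeline's `top_k` argument may also let you ask for more than 5 results without leaving the pipeline.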
