How to filter predicted tokens in masked language modelling?

Hello,

I have trained a masked language model using my own dataset, which contains sentences with emojis (trained on 20,000 entries).

Now, when I make predictions, I want emojis in the output. However, most of the predicted tokens are words, so I suspect the emojis sit much further down the ranked list because they are less frequent than the words.

So far, this is my output. One emoji has been predicted, but the rest of the predictions are ordinary text tokens:

mask_filler("I am so good today <mask>", top_k=5)

[{'score': 0.2953376770019531,
  'sequence': 'I am so good today."',
  'token': 72,
  'token_str': '."'},
 {'score': 0.18523386120796204,
  'sequence': 'I am so good today 🙂',
  'token': 328,
  'token_str': '🙂'},
 {'score': 0.1431082785129547,
  'sequence': 'I am so good today!"',
  'token': 2901,
  'token_str': '!"'},
 {'score': 0.13269349932670593,
  'sequence': 'I am so good today.',
  'token': 4,
  'token_str': '.'},
 {'score': 0.030341114848852158,
  'sequence': 'I am so good today :)',
  'token': 44660,
  'token_str': ' :)'}]

Therefore, I was wondering whether there is any code or function that can filter the predictions so that only emojis appear in the output.
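
To make the question concrete, here is a minimal sketch of the kind of post-filtering I have in mind. It assumes the predictions are a list of dicts shaped like the output above. The names `EMOJI_RANGES`, `is_emoji` and `keep_emoji` are my own hypothetical helpers, not part of any library, and the code point ranges are a rough approximation rather than a complete emoji list.

```python
# Rough, non-exhaustive emoji code point ranges (an assumption, not the
# full Unicode emoji definition).
EMOJI_RANGES = [
    (0x1F300, 0x1FAFF),  # pictographs, emoticons, transport, extended symbols
    (0x1F1E6, 0x1F1FF),  # regional indicators (flags)
    (0x2600, 0x27BF),    # miscellaneous symbols and dingbats
]

# Characters that may appear inside an emoji sequence without being emojis.
EMOJI_MODIFIERS = {0x200D, 0xFE0F}  # zero-width joiner, variation selector-16


def is_emoji(token_str):
    """Return True if the token consists only of emoji characters."""
    text = token_str.strip()
    found = False
    for ch in text:
        cp = ord(ch)
        if cp in EMOJI_MODIFIERS:
            continue
        if any(lo <= cp <= hi for lo, hi in EMOJI_RANGES):
            found = True
        else:
            return False  # any non-emoji character disqualifies the token
    return found


def keep_emoji(predictions, limit=5):
    """Keep only emoji predictions, preserving the original ranking."""
    emoji_only = [p for p in predictions if is_emoji(p["token_str"])]
    return emoji_only[:limit]


# Shortened copy of the output shown above (scores rounded).
sample = [
    {"score": 0.2953, "token": 72, "token_str": '."'},
    {"score": 0.1852, "token": 328, "token_str": "🙂"},
    {"score": 0.1431, "token": 2901, "token_str": '!"'},
    {"score": 0.1327, "token": 4, "token_str": "."},
    {"score": 0.0303, "token": 44660, "token_str": " :)"},
]

print(keep_emoji(sample))
```

With the real pipeline, the idea would be to ask for many more candidates and then filter, for example `keep_emoji(mask_filler("I am so good today <mask>", top_k=1000))`. As far as I can tell, the fill-mask pipeline also accepts a `targets` argument that restricts scoring to a given list of tokens, which might be a cleaner route if the emoji tokens are known in advance.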

I have got one emoji to show in the output, but I think the other emojis are less frequent tokens, so they do not appear near the top when I make predictions.

So, is it possible to filter the predictions so that emojis appear and the words are excluded?
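
Another way to picture "excluding the words" is to mask every non-emoji token before the softmax and renormalise over the emojis only. The sketch below is a toy illustration in plain Python, not the pipeline's own mechanism. The function name `emoji_only_probs`, the logit values, and the token ids 9001 and 9002 are all made up for the example.

```python
import math


def emoji_only_probs(logits, emoji_token_ids):
    """Softmax restricted to emoji token ids; all other tokens are dropped.

    `logits` maps token id -> raw score at the masked position.
    Returns token id -> probability, sorted from most to least likely.
    """
    kept = {tid: logits[tid] for tid in emoji_token_ids if tid in logits}
    if not kept:
        return {}
    peak = max(kept.values())  # subtract the max for numerical stability
    exps = {tid: math.exp(value - peak) for tid, value in kept.items()}
    total = sum(exps.values())
    ranked = sorted(exps.items(), key=lambda item: item[1], reverse=True)
    return {tid: value / total for tid, value in ranked}


# Toy logits: 72, 2901 and 4 are punctuation tokens from the output above,
# 328 is the emoji from the output, 9001 and 9002 are invented emoji ids.
toy_logits = {72: 5.0, 328: 4.5, 2901: 4.3, 4: 4.2, 9001: 1.0, 9002: 0.5}
toy_emoji_ids = {328, 9001, 9002}

for token_id, prob in emoji_only_probs(toy_logits, toy_emoji_ids).items():
    print(token_id, round(prob, 4))
```

In this view the rare emojis still receive a probability, because they compete only with each other rather than with the far more frequent word and punctuation tokens.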

Thanks.