Hi,
When viewing the top predicted tokens in masked language modelling (MLM), is it possible to use top_k with k=len(vocab)?
So far, I have used the following line of code:
mask_filler("The capital of [MASK] is Paris", top_k=5)
In other words, can k=len(vocab) be used so that the predictions cover every token in my vocabulary?
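To make the question concrete, here is a toy sketch of what I understand top_k to mean. It is not the pipeline's actual implementation: the helper top_k_tokens, the five-token vocabulary, and the logits are all made up for illustration. With k equal to the vocabulary size, I would expect every token to come back, ranked by probability.

```python
import math

def top_k_tokens(logits, vocab, k):
    """Return the k highest-probability (token, probability) pairs."""
    # Softmax over the whole vocabulary (subtract the max for numerical stability).
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank all tokens by probability, highest first, and keep the first k.
    ranked = sorted(zip(vocab, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical toy vocabulary and made-up logits for the [MASK] position.
vocab = ["france", "germany", "spain", "paris", "the"]
logits = [4.0, 1.5, 1.0, 0.2, -1.0]

# k = len(vocab): nothing is cut off, so the result is a full ranking.
full_ranking = top_k_tokens(logits, vocab, k=len(vocab))
for token, prob in full_ranking:
    print(f"{token}: {prob:.4f}")
```

If the real pipeline behaves like this sketch, top_k=len(vocab) would simply return one scored entry per vocabulary token.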
Thanks!