I’d like to know whether it’s possible to modify the output head so that the model outputs (generates) only words that are present in the given context, or in a given vocabulary of restricted size. If so, how would you proceed?
I’m currently trying this with T5ForConditionalGeneration, but I think it could be applied to any encoder-decoder text generation model.
I was thinking of replacing the last linear layer with one whose output size equals the maximum input length, and then iterating the decoder to produce a distribution over words at each step until the end token is generated. I’m a bit confused about how to do this… If you have any resources that could help, I would really appreciate it. Thanks!
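To make the idea more concrete, here is a rough sketch of one way I imagine it could work without changing the head at all: keep the full-vocabulary logits and mask out every token id that is not allowed before picking the next token. This is plain Python with no model involved; `TOY_LOGITS`, `toy_step_logits`, and the token ids are made-up stand-ins for a real decoder step, so treat it as an illustration of the masking idea, not as working T5 code.

```python
import math

def mask_logits(logits, allowed_ids):
    """Return a copy of logits where every id outside allowed_ids is -inf."""
    allowed = set(allowed_ids)
    return [x if i in allowed else -math.inf for i, x in enumerate(logits)]

def softmax(logits):
    """Softmax over a list; assumes at least one finite logit."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_decode(step_logits_fn, allowed_ids, eos_id, max_len=20):
    """Greedy decoding restricted to allowed_ids (EOS is always allowed)."""
    allowed = set(allowed_ids) | {eos_id}
    generated = []
    for _ in range(max_len):
        logits = mask_logits(step_logits_fn(generated), allowed)
        # argmax over the masked logits; masked ids can never win
        next_id = max(range(len(logits)), key=logits.__getitem__)
        generated.append(next_id)
        if next_id == eos_id:
            break
    return generated

# Hypothetical decoder output: one row of logits per step, vocabulary of 6 ids.
# Id 4 always has the highest raw score but is not part of the context.
TOY_LOGITS = [
    [0.0, 0.1, 1.0, 2.0, 5.0, 4.0],
    [0.0, 0.5, 3.0, 1.0, 9.0, 0.0],
    [0.0, 4.0, 1.0, 1.0, 9.0, 0.0],
]

def toy_step_logits(generated):
    """Stand-in for a real decoder step; returns logits for the next token."""
    return TOY_LOGITS[min(len(generated), len(TOY_LOGITS) - 1)]

context_ids = [2, 3]  # ids of the tokens that appear in the (made-up) context
eos_id = 1            # made-up end-of-sequence id
output_ids = greedy_decode(toy_step_logits, context_ids, eos_id)
```

In this toy run the unconstrained argmax would always be id 4, while the masked decoder can only emit ids from the context plus the end token. If I read the Hugging Face docs correctly, `generate()` accepts a `prefix_allowed_tokens_fn` argument that seems to apply this kind of restriction at each step, but I am not sure whether that is the recommended route for T5.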