When calling model.generate(), the actual decoder used (greedy, beam search, typical sampling, etc.) is inferred from which args you pass in. That selection is not logged or otherwise visible to the developer. While I was working on a demo of these different methods, it was surprisingly hard to show that model.generate() calls are using the expected methods. For example, if I set the typical_p arg, transformers only applies typical decoding if 0.0 < typical_p < 1.0 and do_sample=True. If I typo a kwarg, it is silently passed on to the model.
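To make the problem concrete, here is a rough sketch of the kind of selection logic I mean. It is not the transformers source: describe_decoding and the logger name are hypothetical, and it only covers num_beams, do_sample and typical_p (it ignores group beam search, constraints, and everything else generate() looks at). The typical_p condition is the one described above.

```python
import logging

logger = logging.getLogger("decoder_choice_sketch")  # hypothetical logger name


def describe_decoding(num_beams=1, do_sample=False, typical_p=None):
    """Hypothetical helper: report which decoder a given arg combination implies.

    Returns (method_name, typical_decoding_active). Simplified on purpose.
    """
    if num_beams > 1:
        method = "beam-search multinomial sampling" if do_sample else "beam search"
    else:
        method = "multinomial sampling" if do_sample else "greedy search"

    # Typical decoding only takes effect when sampling AND 0.0 < typical_p < 1.0.
    typical_active = bool(
        do_sample and typical_p is not None and 0.0 < typical_p < 1.0
    )

    if typical_p is not None and not typical_active:
        # The arg was passed but has no effect: this is the silent case.
        logger.warning(
            "typical_p=%s has no effect (do_sample=%s)", typical_p, do_sample
        )
    logger.info("decoding with %s (typical decoding: %s)", method, typical_active)
    return method, typical_active


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    describe_decoding(typical_p=0.2)                  # silently greedy today
    describe_decoding(do_sample=True, typical_p=0.2)  # typical decoding applies
```

The first call is the one that bit me: typical_p is set, but without do_sample=True the call quietly falls back to greedy search.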
I made a PR (Log the decoder chosen by GenerationMixin by mapmeld, Pull Request #17196, huggingface/transformers on GitHub) which calls logging.info when choices are made (link to a demo notebook included). I’m not sure whether we want to add a new arg to activate this, use logging.info by default, or use logging.warn in some cases. I’m open to scrapping the current changes and reworking them until it looks right.
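For anyone weighing info vs. warn, this is a small stdlib-only sketch of the trade-off (none of it is code from the PR; ListHandler, report_choice and the logger name are made up for illustration). With a logger set to WARNING, info-level messages about the chosen decoder are dropped and only the "your arg was ignored" style warning gets through.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical handler that collects (level, message) pairs for inspection."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append((record.levelname, record.getMessage()))


demo_logger = logging.getLogger("generation_logging_sketch")  # hypothetical name
demo_logger.propagate = False  # keep the demo output out of the root logger
handler = ListHandler()
demo_logger.addHandler(handler)


def report_choice(method, ignored_arg_note=None):
    # Routine choice goes to info; an arg that had no effect goes to warning.
    demo_logger.info("decoding with %s", method)
    if ignored_arg_note:
        demo_logger.warning("%s", ignored_arg_note)


note = "typical_p ignored because do_sample=False"

demo_logger.setLevel(logging.WARNING)  # info messages are filtered out
report_choice("greedy search", note)
quiet = list(handler.messages)

handler.messages.clear()
demo_logger.setLevel(logging.INFO)  # user opted in to more detail
report_choice("greedy search", note)
verbose = list(handler.messages)

print(quiet)
print(verbose)
```

So info-only logging would stay invisible unless the user raises verbosity, while a warning for ignored args would surface the surprising cases without adding a new arg.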