With model.generate(), the decoding method actually used (greedy search, beam search, typical sampling, etc.) is inferred from which args you pass in.
That choice is not logged or otherwise visible to the developer. While I was working on a demo of these different methods, it was surprisingly hard to show that
my model.generate() calls were using the methods I expected. For example, if I set the typical_p arg, transformers only applies typical decoding if 0.0 < typical_p < 1.0 and do_sample=True. And if I typo a kwarg, it is silently passed on to the model.
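To make the problem concrete, here is a rough standalone sketch of the kind of reporting that would help. It is not the code in transformers or in any PR: the helper name describe_decoding, the KNOWN_ARGS set, and the simplified selection rules (only do_sample, num_beams and typical_p are considered) are all assumptions made for illustration. It only shows how a selection could be logged and how a typo'd kwarg could be flagged.

```python
import logging

logger = logging.getLogger("generate_demo")

# Hypothetical subset of generate() argument names, for illustration only.
KNOWN_ARGS = {"do_sample", "num_beams", "typical_p", "top_k", "top_p", "max_length"}


def describe_decoding(**kwargs):
    """Return a label for the decoding method these args would select.

    Simplified assumption: only do_sample, num_beams and typical_p matter.
    """
    # Flag anything we do not recognize instead of silently passing it on.
    unknown = sorted(set(kwargs) - KNOWN_ARGS)
    if unknown:
        logger.warning("Unrecognized generate() arguments: %s", unknown)

    do_sample = kwargs.get("do_sample", False)
    num_beams = kwargs.get("num_beams", 1)
    typical_p = kwargs.get("typical_p")

    if num_beams > 1:
        method = "beam sampling" if do_sample else "beam search"
    else:
        method = "sampling" if do_sample else "greedy search"

    # Typical decoding only applies when sampling and 0.0 < typical_p < 1.0.
    if typical_p is not None:
        if do_sample and 0.0 < typical_p < 1.0:
            method += " with typical decoding"
        else:
            logger.warning("typical_p=%s is set but has no effect", typical_p)

    logger.info("Decoding method: %s", method)
    return method


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    describe_decoding(typical_p=0.2)                  # warns: typical_p has no effect
    describe_decoding(do_sample=True, typical_p=0.2)  # typical decoding is applied
    describe_decoding(do_sampel=True)                 # warns about the typo'd kwarg
```

With output like this, a call that sets typical_p but forgets do_sample=True would show up in the logs as greedy search plus a warning, rather than quietly doing something other than what was intended.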
I made a PR (Log the decoder chosen by GenerationMixin, by mapmeld, Pull Request #17196 in huggingface/transformers) which calls logging.info when these choices are made (a link to a demo notebook is included). I’m not sure whether we want a new arg to activate this, logging.info by default, logging.warning in some cases, and so on. I’m open to scrapping the current changes and reworking it until it looks right.