When decoding, it is common to provide decoding parameters such as
min_length, max_length, beam_size, length penalty, etc.
I wonder whether anyone is aware of a methodology or research for determining these parameters, and whether they could be set dynamically rather than hard-coded.
I have found this paper, for multi-document summarization.
If anyone knows of any additional resources, it would be highly appreciated.
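
To make "dynamic" concrete, here is a minimal sketch of what I have in mind: deriving min_length and max_length from the length of the input instead of fixing them once. The function name, the percentages, and the floor/cap values are all placeholders I made up for illustration; they do not come from any paper or library.

```python
def dynamic_length_params(source_len, min_pct=10, max_pct=40, floor=5, cap=256):
    """Derive length bounds for decoding from the source length (in tokens).

    All defaults are hypothetical placeholders, not recommended values.
    """
    if source_len < 0:
        raise ValueError("source_len must be non-negative")

    # Upper bound: a percentage of the source, clamped to [floor, cap].
    max_length = min(cap, max(floor, source_len * max_pct // 100))

    # Lower bound: a smaller percentage, never above the upper bound.
    min_length = min(max_length, max(floor, source_len * min_pct // 100))

    return {"min_length": min_length, "max_length": max_length}


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, dynamic_length_params(n))
```

The resulting dictionary could then be passed to whatever decoding call is in use, assuming it accepts parameters with these names. The same idea might extend to beam_size or length penalty, but I do not know of a principled way to choose those, which is really the question.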