I’ve fine-tuned CTRL, and both greedy decoding (with repetition_penalty=1.2 and temperature=0) and top-k sampling ({'do_sample': True, 'max_length': 64, 'top_k': 50, 'top_p': 0.95}) often produce empty sequences when varied examples are provided as context.
What usually causes this? I tried playing with the parameters in model.generate, but they didn’t seem to help much. Oddly, some intermediate checkpoints are able to generate text for the same sample.
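For reference, here is a minimal sketch of the two decoding setups I described, assuming a standard Hugging Face transformers workflow (the checkpoint path and prompt are placeholders, not my actual values):

```python
from transformers import CTRLLMHeadModel, CTRLTokenizer

# Placeholder path to the fine-tuned checkpoint.
tokenizer = CTRLTokenizer.from_pretrained("./ctrl-finetuned")
model = CTRLLMHeadModel.from_pretrained("./ctrl-finetuned")

# Placeholder context; CTRL expects a control code prefix like "Links".
input_ids = tokenizer("Links Some example context", return_tensors="pt").input_ids

# Greedy decoding: with do_sample=False, temperature has no effect;
# repetition_penalty discounts tokens that were already generated.
greedy_out = model.generate(
    input_ids,
    do_sample=False,
    max_length=64,
    repetition_penalty=1.2,
)

# Top-k / top-p sampling with the parameters from the post.
sampled_out = model.generate(
    input_ids,
    do_sample=True,
    max_length=64,
    top_k=50,
    top_p=0.95,
)

print(tokenizer.decode(greedy_out[0], skip_special_tokens=True))
print(tokenizer.decode(sampled_out[0], skip_special_tokens=True))
```

In this sketch, an “empty” generation corresponds to the model emitting the EOS token immediately after the prompt, so the decoded output contains nothing beyond the context.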