I’ve finetuned CTRL, and it seems that both greedy decoding (repetition_penalty=1.2, temperature=0) and top-k sampling ({'max_length': 64, 'do_sample': True, 'top_k': 50, 'top_p': 0.95}) often produce empty sequences when varied examples are provided as context. I tried playing with the parameters in model.generate, but they didn’t seem to help much directly. Also, some intermediate checkpoints are able to generate text for the same sample. What is usually the reason for this?
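For reference, here is a minimal sketch of the two decoding setups described above, plus a check that flags “empty” generations (nothing beyond the prompt except end-of-sequence). The keyword arguments match the transformers model.generate API, but fake_generate is a hypothetical stand-in for the real model, since loading a CTRL checkpoint here would be heavy:

```python
# The two decoding configurations from the question, as generate() kwargs.
GREEDY_KWARGS = {"max_length": 64, "do_sample": False,
                 "repetition_penalty": 1.2}  # greedy: sampling disabled
SAMPLING_KWARGS = {"max_length": 64, "do_sample": True,
                   "top_k": 50, "top_p": 0.95}

EOS_ID = 0  # hypothetical end-of-sequence token id

def fake_generate(input_ids, **kwargs):
    # Hypothetical stand-in for model.generate: a degenerate model that
    # immediately emits EOS, reproducing the "empty sequence" symptom.
    return input_ids + [EOS_ID]

def is_empty_generation(prompt_ids, output_ids, eos_id=EOS_ID):
    """True if the model produced nothing beyond the prompt except EOS."""
    new_tokens = output_ids[len(prompt_ids):]
    return all(t == eos_id for t in new_tokens)

prompt = [5, 17, 42]
out = fake_generate(prompt, **GREEDY_KWARGS)
print(is_empty_generation(prompt, out))  # → True
```

A check like is_empty_generation makes it easy to count, across a validation set, how often each decoding configuration collapses to an immediate EOS.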
Hi, I have the same question. Did you figure out why this problem occurs?
You may need some formatting for the prompt, such as decorating the context with a question-answer template like f'Q:{context}\nA:', to get the finetuned model to generate the text you want.
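A minimal sketch of that idea: wrap the raw context in the Q/A template before passing it to the tokenizer and model.generate. The template string is the one suggested above; the helper name is mine, and it assumes the model was finetuned on prompts in this format:

```python
def format_prompt(context: str) -> str:
    """Wrap the raw context in the Q/A template the model was
    (assumed to be) finetuned on, so generation continues after 'A:'."""
    return f"Q:{context}\nA:"

print(format_prompt("What causes empty generations?"))
# → Q:What causes empty generations?
#   A:
```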