Earlier I was using `generate` without `length_penalty` and `no_repeat_ngram_size`. After adding these two params, inference has slowed down significantly (more than 2x). Is this the intended behaviour?