Using Seq2SeqTrainer for decoders?

I have a case where I can’t use SFTTrainer because:

  1. My data is pretokenized
  2. I need to use predict_with_generate to get validation set evals during training

The regular Trainer can handle pretokenized data but doesn’t have predict_with_generate.
Can I use Seq2SeqTrainer or does that only make sense for encoder-decoder models?

I’m fine-tuning a 7B LLM with QLoRA.
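For concreteness, this is roughly the setup I have in mind — a sketch only, since I haven’t verified that `Seq2SeqTrainer` accepts a decoder-only model; the `model`, dataset, and metrics variables are placeholders:

```python
from transformers import Seq2SeqTrainer, Seq2SeqTrainingArguments

# Seq2SeqTrainingArguments extends TrainingArguments with
# generation-related options like predict_with_generate.
args = Seq2SeqTrainingArguments(
    output_dir="qlora-7b-out",
    predict_with_generate=True,   # run generate() on the eval set
    generation_max_length=256,
)

# The hope: pass the QLoRA-wrapped decoder and the pretokenized
# datasets exactly as one would with the regular Trainer.
trainer = Seq2SeqTrainer(
    model=model,                      # placeholder: PEFT/QLoRA 7B model
    args=args,
    train_dataset=train_ds,           # placeholder: pretokenized dataset
    eval_dataset=eval_ds,             # placeholder: pretokenized dataset
    compute_metrics=compute_metrics,  # placeholder: metric fn over generated ids
)
trainer.train()
```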

Thanks!
