I have a case where I can’t use `SFTTrainer` because:
- My data is pretokenized
- I need `predict_with_generate` to get validation-set evals during training
The regular `Trainer` handles pretokenized data fine, but it doesn’t support `predict_with_generate`.
Can I use `Seq2SeqTrainer` with a decoder-only model, or does it only make sense for encoder-decoder models?
I’m fine-tuning a 7B LLM with QLoRA.
Thanks!