I have tried many variations of completely different code and I can’t get it working. I can train any of the smaller T5 models, but once a model requires multiple GPUs, I can’t get it to work. I have tried DeepSpeed, Accelerate, and solutions that use neither.
So my question is: does anyone have code that fine-tunes t5-11b, or know where to find some?