Hi,
I am loading the sharded FLAN-T5 XXL checkpoint "philschmid/flan-t5-xxl-sharded-fp16" for fine-tuning, and I want to do multi-GPU training with it. After loading the model with AutoModelForSeq2SeqLM, I set setattr(model, "model_parallel", True), but I get the following error when training with Trainer:
AttributeError: 'T5Stack' object has no attribute 'first_device'
Can someone help with this issue?