Flan-T5 multi-GPU training issue

Hi,

I am loading the sharded Flan-T5 XXL checkpoint "philschmid/flan-t5-xxl-sharded-fp16" for fine-tuning, and I want to train it on multiple GPUs. After loading the model with AutoModelForSeq2SeqLM, I set setattr(model, 'model_parallel', True), but I get the following error when training with Trainer:
AttributeError: 'T5Stack' object has no attribute 'first_device'
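
A minimal sketch of the steps described above, to make the setup easier to reproduce. The helper name load_and_flag_model and the torch_dtype argument are assumptions added for illustration and are not from the original script. The Trainer setup is omitted.

```python
MODEL_ID = "philschmid/flan-t5-xxl-sharded-fp16"  # checkpoint named in the post


def load_and_flag_model(model_id: str = MODEL_ID):
    """Load the checkpoint and set model_parallel, as described in the post.

    Hypothetical helper, for illustration only.
    """
    # Imported lazily so the sketch can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForSeq2SeqLM

    # Assumption: load in float16 to match the fp16 checkpoint.
    model = AutoModelForSeq2SeqLM.from_pretrained(
        model_id, torch_dtype=torch.float16
    )

    # The attribute set in the post. The AttributeError is reported later,
    # during training with Trainer, not at this line.
    setattr(model, "model_parallel", True)
    return model
```

Calling load_and_flag_model() downloads the full XXL checkpoint, so it needs enough disk space and memory for the weights.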

Can someone help with this issue?