Hi there,
I want to train T5-XXL on a domain corpus (unsupervised, span-masked training). I've got 2 GPUs à 40 GB at hand, so I need to make use of Accelerate and DeepSpeed.
The example script run_t5_mlm_flax.py from the Flax examples does not support Accelerate.
The script run_mlm_no_trainer.py from the PyTorch examples supports Accelerate, but not T5's particular training objective (span corruption with sentinel tokens).
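To make it concrete, this is the objective I mean, sketched with plain word tokens instead of real tokenizer ids (the `span_corrupt` helper is hypothetical, not from any HF script): each masked span is replaced by a sentinel like `<extra_id_0>` in the input, and the target lists the sentinels followed by the dropped tokens, as in the T5 paper.

```python
# Minimal sketch of T5's span-corruption objective.
# Hypothetical illustration only; real scripts operate on token ids.

def span_corrupt(tokens, spans):
    """Replace each (start, end) span with a sentinel token in the input,
    and emit the dropped spans, each prefixed by its sentinel, as the target."""
    inputs, targets = [], []
    pos = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        inputs.extend(tokens[pos:start])   # keep unmasked tokens
        inputs.append(sentinel)            # stand-in for the masked span
        targets.append(sentinel)           # target echoes the sentinel...
        targets.extend(tokens[start:end])  # ...followed by the dropped span
        pos = end
    inputs.extend(tokens[pos:])
    targets.append(f"<extra_id_{len(spans)}>")  # final sentinel ends the target
    return inputs, targets

tokens = "Thank you for inviting me to your party last week".split()
inp, tgt = span_corrupt(tokens, [(2, 4), (8, 9)])
print(" ".join(inp))  # Thank you <extra_id_0> me to your party <extra_id_1> week
print(" ".join(tgt))  # <extra_id_0> for inviting <extra_id_1> last <extra_id_2>
```

run_t5_mlm_flax.py implements exactly this (via its data collator), while run_mlm_no_trainer.py only does BERT-style token masking.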
Is it true that Hugging Face's Flax support is in "maintenance mode"?
Thanks,
Daniel