I want to train T5-XXL on a domain corpus (unsupervised, masked-language-model pretraining). I have 2 GPUs with 40 GB each, so I need to make use of Accelerate and DeepSpeed.
The example script run_t5_mlm_flax.py from the Flax examples does not support Accelerate.
The script run_mlm_no_trainer.py from the PyTorch examples supports Accelerate, but not T5's particular pretraining objective (span corruption with sentinel tokens rather than BERT-style per-token masking).
Is it true that Hugging Face's Flax support is in "maintenance mode"?