T5-XXL MLM distributed training?

Hi there,
I want to train T5-XXL on a domain corpus (unsupervised, masked training). I've got 2 GPUs with 40 GB each at hand, so I need to make use of Accelerate and DeepSpeed (rough sketch of what I mean below).
The example script run_t5_mlm_flax.py from the Flax examples does not support Accelerate :frowning:
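
Roughly what I have in mind on the PyTorch side is something like the sketch below. This is my own assumption, not taken from any HF example: ZeRO stage 3 with CPU offload through accelerate's DeepSpeedPlugin, with dummy data just to keep it self-contained, so the exact arguments may need adjusting for your accelerate version.

```python
# Sketch only (my assumption, not from any HF example): ZeRO stage 3 with CPU
# offload so the ~11B T5-XXL parameters fit on 2x40GB GPUs.
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import T5ForConditionalGeneration
from accelerate import Accelerator, DeepSpeedPlugin

ds_plugin = DeepSpeedPlugin(
    zero_stage=3,                    # shard params, grads and optimizer states
    offload_optimizer_device="cpu",  # keep optimizer states in CPU RAM
    offload_param_device="cpu",      # keep idle parameters in CPU RAM
    gradient_accumulation_steps=16,
)
accelerator = Accelerator(mixed_precision="bf16", deepspeed_plugin=ds_plugin)

model = T5ForConditionalGeneration.from_pretrained("google/t5-v1_1-xxl")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# dummy token ids just to keep the sketch self-contained
dummy_ids = torch.zeros((8, 512), dtype=torch.long)
train_loader = DataLoader(TensorDataset(dummy_ids, dummy_ids), batch_size=1)

model, optimizer, train_loader = accelerator.prepare(model, optimizer, train_loader)
# ... then the usual loop with accelerator.backward(loss),
# started on both GPUs via `accelerate launch`
```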

The script run_mlm_no_trainer.py from the PyTorch examples supports Accelerate, but not T5's span-corruption pre-training objective.
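
To make clear what I mean by T5's training mode: whole spans are replaced by sentinel tokens in the input and the target reconstructs them, whereas run_mlm_no_trainer.py (as far as I can see) only does BERT-style per-token masking via DataCollatorForLanguageModeling. A hand-made toy example, not taken from any of the scripts:

```python
# Toy illustration of T5 span corruption (hand-made example, not from the scripts):
# spans in the input are replaced by sentinel tokens, and the target lists the
# dropped spans, each introduced by its sentinel.
from transformers import T5TokenizerFast

tokenizer = T5TokenizerFast.from_pretrained("t5-small")

# original sentence: "The quick brown fox jumps over the lazy dog"
corrupted_input = "The <extra_id_0> fox jumps <extra_id_1> the lazy dog"
target = "<extra_id_0> quick brown <extra_id_1> over <extra_id_2>"

enc = tokenizer(corrupted_input, return_tensors="pt")
labels = tokenizer(target, return_tensors="pt").input_ids
# a T5 model would then be trained seq2seq-style with model(**enc, labels=labels)
```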

Is it true that Hugging Face's Flax support is in “maintenance mode”?

thanks :slight_smile:
Daniel

I think Accelerate is designed for PyTorch, not Flax.