How can I speed up T5 loading?

vedantroy · September 25, 2024, 9:57pm

I currently load T5 with:

self.t5_enc = T5EncoderModel.from_pretrained(T5_MODEL).eval().to(self.device)

But I’m not sure whether this uses optimizations like torch.nn.utils.skip_init or loads an FSDP sharded checkpoint for maximum loading speed.

How can I ensure these optimizations are being enabled?

John6666 · September 26, 2024, 2:00am

I’ve never used FSDP so I don’t know…
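
For the skip-initialization part of the question, here is a minimal sketch rather than a confirmed answer. It assumes a transformers release from around the time of this thread, where from_pretrained accepts low_cpu_mem_usage=True; that flag creates the model without running the usual random weight initialization and then fills in the checkpoint weights, which is similar in spirit to torch.nn.utils.skip_init. The checkpoint name and the load_t5_encoder helper are hypothetical, since the original post only refers to T5_MODEL. This does not cover loading an FSDP sharded checkpoint.

```python
from transformers import T5EncoderModel

# Hypothetical checkpoint name for illustration; the original post only says T5_MODEL.
T5_MODEL = "google/t5-v1_1-small"


def load_t5_encoder(model_name: str = T5_MODEL, device: str = "cpu") -> T5EncoderModel:
    """Load the T5 encoder without the usual random weight initialization."""
    model = T5EncoderModel.from_pretrained(
        model_name,
        # Assumption: this flag is accepted by the installed transformers version.
        # It avoids initializing weights that the checkpoint overwrites anyway.
        low_cpu_mem_usage=True,
    )
    # Same post-processing as in the original snippet: eval mode, then move to device.
    return model.eval().to(device)
```

Calling load_t5_encoder(device="cuda") would then replace the original one-liner. Whether it is actually faster is worth timing on your own machine, since disk read speed often dominates.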
