I am trying to use the Horovod library with PyTorch, but I cannot get it to run. I get an error in the transformer.trainer script when I pass the optimizer. I am following the same code format shown on the Horovod docs page for PyTorch, but I still cannot resolve it. Is there any script that uses Horovod that you can show me, so that I can modify it and retry?