As far as I know, it does.
It should work all the same, except that there is no need to initialize an optimizer, scheduler, etc. with the accelerator; you only prepare the device, the eval_dataloader, and the model.
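Roughly, a minimal inference-only sketch could look like the following (the toy `torch.nn.Linear` model and random dataset are just placeholders for your own; launch it with `accelerate launch script.py`):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()

# Toy model and dataset purely for illustration; substitute your own.
model = torch.nn.Linear(10, 2)
dataset = TensorDataset(torch.randn(64, 10))
eval_dataloader = DataLoader(dataset, batch_size=8)

# For inference, only the model and dataloader are prepared --
# no optimizer or scheduler needed.
model, eval_dataloader = accelerator.prepare(model, eval_dataloader)
model.eval()

predictions = []
with torch.no_grad():
    for (batch,) in eval_dataloader:
        outputs = model(batch)
        # collect results from every process
        predictions.append(accelerator.gather(outputs))
```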
In case it doesn’t work for some reason, there are other wrappers for running distributed inference (which also give a speed-up), such as Optimum (which is built to accelerate inference).
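For example, with Optimum’s ONNX Runtime backend you can export a Transformers model and run it through a regular pipeline; this is just a sketch, and the exact class names and the `export=True` flag may differ between Optimum versions:

```python
# Hedged sketch: exporting a Transformers model to ONNX Runtime via Optimum.
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

model_id = "distilbert-base-uncased-finetuned-sst-2-english"
model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(classifier("Optimum can speed up plain single-process inference too."))
```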
In addition, it’s worth mentioning that you can always do it the “hard” way and implement things yourself with torch.nn.DataParallel or torch.nn.parallel.DistributedDataParallel.
You can then run your code via the torchrun console script. But again, I personally find this method harder than using a wrapper like Accelerate or Optimum.
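To give an idea of what the “hard” way involves, here is a rough DistributedDataParallel inference sketch (again with a toy model/dataset as placeholders), launched with something like `torchrun --nproc_per_node=2 infer_ddp.py`:

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

def main():
    # torchrun sets LOCAL_RANK and the other rendezvous env vars for us.
    dist.init_process_group("nccl")  # use "gloo" on CPU-only machines
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Toy model and dataset purely for illustration; substitute your own.
    model = torch.nn.Linear(10, 2).to(local_rank)
    model = DDP(model, device_ids=[local_rank])
    model.eval()

    dataset = TensorDataset(torch.randn(64, 10))
    sampler = DistributedSampler(dataset, shuffle=False)
    loader = DataLoader(dataset, batch_size=8, sampler=sampler)

    with torch.no_grad():
        for (batch,) in loader:
            outputs = model(batch.to(local_rank))
            # each rank processes its own shard of the data here

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```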