Recommended Approach for Distributed Inference

I am looking to run inference with Optimum in a distributed PyTorch setting (multi-node, multi-CPU/GPU). Is there a recommended approach for this? My data comes from a HF Datasets object.

I tried using this solution with the HF Trainer, but it gives me an error when I run it with an Optimum model (the Optimum model does not have an eval() method).

You could create multiple ORTModelForXXX instances, each using a different device, and then iterate over your dataset either synchronously or asynchronously with a queue.

So in this solution, would I be using something like torch.distributed to manage the processes and aggregate the prediction results? Are there any code examples you could point me to?

Or just Python threads; there is no need for torch.distributed. No, we don't have any code examples for that.
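A minimal sketch of the threads-plus-queue pattern described above. The `predict` function here is a hypothetical placeholder for real inference; in practice each worker would hold its own ORTModelForXXX instance bound to a different device (the commented load call is an assumption to verify against your Optimum version):

```python
import threading
import queue

# Hypothetical stand-in for per-device Optimum inference. A real worker
# would load one model per device, roughly (assumed API, check your
# optimum version):
#   from optimum.onnxruntime import ORTModelForSequenceClassification
#   model = ORTModelForSequenceClassification.from_pretrained(...)
def predict(worker_id, example):
    # Placeholder "inference": real code would run the ORT session here.
    return {"worker": worker_id, "length": len(example["text"])}

def run_inference(dataset, num_workers=2):
    """Fan examples out to worker threads via a queue, collect results."""
    tasks = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker(worker_id):
        while True:
            item = tasks.get()
            if item is None:          # sentinel: no more work
                tasks.task_done()
                return
            out = predict(worker_id, item)
            with lock:                # results list is shared across threads
                results.append(out)
            tasks.task_done()

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for t in threads:
        t.start()
    # A datasets.Dataset iterates example-by-example just like this list.
    for example in dataset:
        tasks.put(example)
    for _ in threads:
        tasks.put(None)               # one sentinel per worker
    for t in threads:
        t.join()
    return results

# Usage with a plain list standing in for a datasets.Dataset:
preds = run_inference([{"text": "hello"}, {"text": "distributed inference"}])
```

Each worker pulls examples off the shared queue until it sees a sentinel, so the dataset is load-balanced across devices without torch.distributed; aggregation is just the shared `results` list guarded by a lock.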