Data-Parallel Multi-GPU Inference

You just move the model to the device. Check out the new distributed inference tutorial, and install accelerate from dev if you want to use the new `split_between_processes` API. Otherwise, pass your dataloader to `Accelerator.prepare` and call `model.to(state.device)`.
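Roughly, that second path looks like this (a minimal sketch; the model, data, and batch size below are dummy placeholders, swap in your own):

```python
import torch
from torch.utils.data import DataLoader
from accelerate import Accelerator

accelerator = Accelerator()

# Dummy model and data; swap in your own.
model = torch.nn.Linear(128, 2)
dataloader = DataLoader(torch.randn(64, 128), batch_size=8)

# prepare() shards the dataloader across processes;
# the model is just moved to this process's GPU, no DDP wrapper.
dataloader = accelerator.prepare(dataloader)
model.to(accelerator.device)
model.eval()

preds = []
with torch.no_grad():
    for batch in dataloader:
        out = model(batch)
        # Regather the sharded results on every process.
        preds.append(accelerator.gather_for_metrics(out))
```

This assumes you launch with `accelerate launch script.py` so each GPU gets its own process.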

Wrapping your model in DDP is only relevant when you need to update gradients (that's what it's designed for), so for inference just load the model onto the device normally.
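If you go the `split_between_processes` route on the dev build instead, a rough sketch (again, the model and inputs here are stand-ins, not the tutorial's exact code):

```python
import torch
from accelerate import PartialState

state = PartialState()

# Dummy model; load yours and move it to this process's device.
model = torch.nn.Linear(128, 2)
model.to(state.device)
model.eval()

# Each process receives its own slice of the inputs to run through the model.
inputs = [torch.randn(128) for _ in range(8)]
with state.split_between_processes(inputs) as shard:
    with torch.no_grad():
        outputs = [model(x.to(state.device)) for x in shard]
```

Note there's no DataLoader and no DDP wrapper involved here: each process just runs its slice of the inputs through a plain model on its own device.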