Distributed inference on multiple files

vikalex · March 8, 2022, 3:46pm

Reposting the issue from GitHub: "Deadlock when loading the model in multiprocessing context" (Issue #15976, huggingface/transformers).

I am using the following snippet:

import torch
from pathlib import Path
import multiprocessing as mp
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

queue = mp.Queue()


def load_model(filename):
    device = queue.get()
    print('Loading')
    model = AutoModelForSeq2SeqLM.from_pretrained('models/sqgen').to(device)
    print('Loaded')
    queue.put(device)


def parallel():
    num_gpus = torch.cuda.device_count()

    with mp.get_context('spawn').Pool(processes=num_gpus) as pool:
        for gpu_id in range(num_gpus):
            queue.put('cuda:{0}'.format(gpu_id))
        pool = mp.Pool(processes=num_gpus)
        flist = list(Path('data').glob('*.json'))
        pool.map(
            load_model,
            flist,
        )
        pool.close()
        pool.join()


if __name__ == '__main__':
    parallel()

This just hangs when loading the model. This is a minimal example I cooked up to demonstrate the issue.

What I am actually trying to do: I have 16 large files (possibly more) and 8 GPUs, so I want to assign each file to a GPU and run inference in 8 parallel processes at a time, so that all GPUs are in use simultaneously.

Why is this issue happening? Why does model loading deadlock?
What’s the right way to do what I want to achieve?
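
For reference, here is a minimal sketch of one common pattern for this kind of setup. It is not a confirmed fix for the hang above. It rests on two observations. First, the snippet re-creates the pool with a plain `mp.Pool(...)`, which uses the default start method instead of the `spawn` context opened on the line before, and the PyTorch multiprocessing notes state that CUDA in subprocesses requires `spawn` or `forkserver` (the default on Linux has historically been `fork`). Second, under `spawn` a module-level `mp.Queue()` is re-created empty in every child, so it is safer to pass each worker its GPU id as an argument. The names `assign_files`, `worker`, `run_inference` and `main` are hypothetical, and `run_inference` is only a placeholder for the real per-file logic.

```python
import multiprocessing as mp
from pathlib import Path


def assign_files(files, num_workers):
    """Round-robin split: worker i gets files i, i + n, i + 2n, ..."""
    return [files[i::num_workers] for i in range(num_workers)]


def run_inference(model, tokenizer, device, path):
    # Placeholder (assumption): read `path`, tokenize, call model.generate(),
    # and write the results somewhere. Replace with the real logic.
    print(f"[{device}] would process {path}")


def worker(gpu_id, files, model_dir):
    # Heavy imports live here so CUDA is only initialised inside the child.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    device = torch.device(f"cuda:{gpu_id}")
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    # Load the model once per process, then reuse it for every file.
    model = AutoModelForSeq2SeqLM.from_pretrained(model_dir).to(device).eval()
    for path in files:
        run_inference(model, tokenizer, device, path)


def main(data_dir="data", model_dir="models/sqgen"):
    import torch

    num_gpus = torch.cuda.device_count()
    files = sorted(Path(data_dir).glob("*.json"))

    # Use the spawn context explicitly for every process we create.
    ctx = mp.get_context("spawn")
    procs = [
        ctx.Process(target=worker, args=(gpu_id, shard, model_dir))
        for gpu_id, shard in enumerate(assign_files(files, num_gpus))
        if shard  # skip GPUs that received no files
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

To run it, call `main()` from under an `if __name__ == '__main__':` guard, which `spawn` requires. With 16 files and 8 GPUs, each process loads the model once and handles two files. The split is static, so if file sizes differ a lot, a queue created from the same `spawn` context and passed to the workers as an argument would likely balance the load better.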

Gian-hf · January 22, 2023, 2:20pm

Hi, it may be too late, but this answers your question:
