Inference with DataParallel

tl;dr: This is an attempt at using the DataParallel class with Hugging Face models, but I still can't figure it out. Could you give me some examples?

Hello, I would like to use my two GPUs to run inference with DataParallel. I adapted a script that works well on one GPU, but I'm stuck with an error:

from torch.nn.parallel import DataParallel
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM  # type: ignore


device = "cuda:0"
tokenizer = AutoTokenizer.from_pretrained("sshleifer/distill-pegasus-cnn-16-4")
model = AutoModelForSeq2SeqLM.from_pretrained("sshleifer/distill-pegasus-cnn-16-4")

model = DataParallel(model, device_ids=list(range(torch.cuda.device_count())))
model.to(device=device)

# df_chunk and max_length_abstract are defined earlier in the full script
batch = tokenizer(
    df_chunk["abstract"].to_list(),
    truncation=True,
    padding="longest",
    max_length=max_length_abstract,
    return_tensors="pt",
).to(device)
print("len token batch", batch["input_ids"].shape)
translated = model.generate(**batch)  # <--- error here
tldrs = tokenizer.batch_decode(translated, skip_special_tokens=True)

'DataParallel' object has no attribute 'generate'

So I replaced the faulty line with the following, using the __call__ method of PyTorch models:

translated = model(**batch)

but now I get the following error:


ValueError: Caught ValueError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/scratch/scampo01/condaenvs/torchA100_v2/lib/python3.9/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/scratch/scampo01/condaenvs/torchA100_v2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/scratch/scampo01/condaenvs/torchA100_v2/lib/python3.9/site-packages/transformers/models/pegasus/modeling_pegasus.py", line 1390, in forward
    outputs = self.model(
  File "/scratch/scampo01/condaenvs/torchA100_v2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/scratch/scampo01/condaenvs/torchA100_v2/lib/python3.9/site-packages/transformers/models/pegasus/modeling_pegasus.py", line 1250, in forward
    decoder_outputs = self.decoder(
  File "/scratch/scampo01/condaenvs/torchA100_v2/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/scratch/scampo01/condaenvs/torchA100_v2/lib/python3.9/site-packages/transformers/models/pegasus/modeling_pegasus.py", line 1014, in forward
    raise ValueError("You have to specify either decoder_input_ids or decoder_inputs_embeds")
ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds

So finally I replaced the faulty line with

translated = model(decoder_input_ids=batch["input_ids"], **batch)

(unlike generate(), forward() is a single teacher-forced pass, so it needs explicit decoder inputs). This seems to work, until the tokenizer cannot decode the outputs: the forward pass returns logits rather than token IDs, so I had to modify the end of the script like this to get anything out:

    # old: tldrs = tokenizer.batch_decode(translated, skip_special_tokens=True)
    tldrs = tokenizer.batch_decode(
        torch.argmax(translated.logits, dim=2), skip_special_tokens=True
    )

But my results and this code are so messy that it is obvious I missed something. I did not find many (any?) examples of how to use DataParallel with Hugging Face models for inference. Could you give me some pointers and tell me what is wrong in my script?

Thank you


I happened to come across the same question. Did you find a more elegant way to solve it? It seems that if you replace model.generate(batch["input_ids"]) with model(decoder_input_ids=batch["input_ids"], **batch) and tldrs = tokenizer.batch_decode(torch.argmax(translated.logits, dim=2)), then you are performing argmax (greedy) decoding. You cannot use the packaged decoding strategies of model.generate(), such as beam search, which is not convenient. See the sketch below.
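For what it's worth, here is a minimal sketch of the difference, reusing model, tokenizer, and batch from the script above (num_beams=4 and max_length=128 are illustrative values, not from the original post):

    # A single teacher-forced forward pass plus a per-position argmax: this is
    # what the workaround above computes (argmax decoding, no autoregression).
    outputs = model(decoder_input_ids=batch["input_ids"], **batch)
    greedy_ids = torch.argmax(outputs.logits, dim=2)

    # generate() on the underlying model supports packaged decoding strategies
    # such as beam search; .module exposes the model that DataParallel wraps.
    beam_ids = model.module.generate(**batch, num_beams=4, max_length=128)
    tldrs = tokenizer.batch_decode(beam_ids, skip_special_tokens=True)

Note that generate() reached through .module runs on a single device: DataParallel only scatters forward() calls, so generation itself is not parallelized this way.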

Since the model variable is a DataParallel instance, its attributes are accessible through model.module. Have you tried

translated = model.module.generate(**batch)

?
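For reference, a minimal end-to-end sketch of this suggestion (the input texts are placeholders; the model setup follows the original script):

    import torch
    from torch.nn.parallel import DataParallel
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    device = "cuda:0"
    tokenizer = AutoTokenizer.from_pretrained("sshleifer/distill-pegasus-cnn-16-4")
    model = AutoModelForSeq2SeqLM.from_pretrained("sshleifer/distill-pegasus-cnn-16-4")
    model = DataParallel(model, device_ids=list(range(torch.cuda.device_count())))
    model.to(device)

    texts = ["First abstract ...", "Second abstract ..."]  # placeholder inputs
    batch = tokenizer(
        texts, truncation=True, padding="longest", return_tensors="pt"
    ).to(device)

    # .module is the wrapped seq2seq model, which does have generate()
    translated = model.module.generate(**batch)
    tldrs = tokenizer.batch_decode(translated, skip_special_tokens=True)

As noted above, this runs generation on one GPU only; to use both GPUs for generate() you would have to split the batch across devices yourself or move to torch.distributed / Accelerate.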

accelerator.unwrap_model(model) worked for me instead of model.module; a sketch follows.
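A minimal sketch of that route, assuming the model is prepared with Accelerate (the input texts are again placeholders; for multi-GPU, the script would be run with accelerate launch):

    from accelerate import Accelerator
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    accelerator = Accelerator()  # run with `accelerate launch script.py` for multi-GPU
    tokenizer = AutoTokenizer.from_pretrained("sshleifer/distill-pegasus-cnn-16-4")
    model = AutoModelForSeq2SeqLM.from_pretrained("sshleifer/distill-pegasus-cnn-16-4")
    model = accelerator.prepare(model)

    texts = ["First abstract ...", "Second abstract ..."]  # placeholder inputs
    batch = tokenizer(
        texts, truncation=True, padding="longest", return_tensors="pt"
    ).to(accelerator.device)

    # unwrap_model strips the parallel wrapper so generate() is reachable again
    translated = accelerator.unwrap_model(model).generate(**batch)
    tldrs = tokenizer.batch_decode(translated, skip_special_tokens=True)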