TL;DR: This is an attempt at using the DataParallel class with Hugging Face models, but I still can't figure it out. Could you give me some examples?
Hello, I would like to use my two GPUs to run inference with DataParallel. I adapted a script that works well on one GPU, but I'm stuck with an error:
from torch.nn.parallel import DataParallel
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM # type: ignore
device = "cuda:0"
tokenizer = AutoTokenizer.from_pretrained("sshleifer/distill-pegasus-cnn-16-4")
model = AutoModelForSeq2SeqLM.from_pretrained("sshleifer/distill-pegasus-cnn-16-4")
model = DataParallel(model, device_ids=list(range(torch.cuda.device_count())))
model.to(device=device)
batch = tokenizer(
df_chunk["abstract"].to_list(),
truncation=True,
padding="longest",
max_length=max_length_abstract,
return_tensors="pt",
).to(device)
translated = model.generate(**batch) # <--- error here
print("len token batch", batch["input_ids"].shape)
tldrs = tokenizer.batch_decode(translated, skip_special_tokens=True)
The error is: ‘DataParallel’ object has no attribute ‘generate’.
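To make this error concrete, here is a minimal sketch with a hypothetical `TinyGenerator` module standing in for the Pegasus model (the class and its `generate` method are assumptions for illustration, not Hugging Face code). It shows that `DataParallel` only wraps `forward()`, so custom methods are not exposed on the wrapper, while the original model stays reachable through `.module`:

```python
import torch
from torch import nn
from torch.nn.parallel import DataParallel


class TinyGenerator(nn.Module):
    """Hypothetical stand-in for a Hugging Face seq2seq model."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 4)

    def forward(self, x):
        return self.linear(x)

    def generate(self, x):
        # Custom method: not part of the nn.Module interface.
        with torch.no_grad():
            return self.forward(x).argmax(dim=-1)


net = TinyGenerator()
wrapped = DataParallel(net)

# DataParallel only parallelizes forward(); other methods are not forwarded.
has_generate = hasattr(wrapped, "generate")

# The original model is still reachable through .module.
tokens = wrapped.module.generate(torch.randn(2, 4))
```

Note that calling `wrapped.module.generate(...)` bypasses the wrapper entirely, so it runs on a single device and is not split across GPUs.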
So I replaced the faulty line with the following one, which uses the __call__ method of PyTorch modules:
translated = model(**batch)
but now I get the following error:
ValueError: Caught ValueError in replica 0 on device 0.
Original Traceback (most recent call last):
File “/scratch/scampo01/condaenvs/torchA100_v2/lib/python3.9/site-packages/torch/nn/parallel/parallel_apply.py”, line 61, in _worker
output = module(*input, **kwargs)
File “/scratch/scampo01/condaenvs/torchA100_v2/lib/python3.9/site-packages/torch/nn/modules/module.py”, line 1102, in _call_impl
return forward_call(*input, **kwargs)
File “/scratch/scampo01/condaenvs/torchA100_v2/lib/python3.9/site-packages/transformers/models/pegasus/modeling_pegasus.py”, line 1390, in forward
outputs = self.model(
File “/scratch/scampo01/condaenvs/torchA100_v2/lib/python3.9/site-packages/torch/nn/modules/module.py”, line 1102, in _call_impl
return forward_call(*input, **kwargs)
File “/scratch/scampo01/condaenvs/torchA100_v2/lib/python3.9/site-packages/transformers/models/pegasus/modeling_pegasus.py”, line 1250, in forward
decoder_outputs = self.decoder(
File “/scratch/scampo01/condaenvs/torchA100_v2/lib/python3.9/site-packages/torch/nn/modules/module.py”, line 1102, in _call_impl
return forward_call(*input, **kwargs)
File “/scratch/scampo01/condaenvs/torchA100_v2/lib/python3.9/site-packages/transformers/models/pegasus/modeling_pegasus.py”, line 1014, in forward
raise ValueError(“You have to specify either decoder_input_ids or decoder_inputs_embeds”)
ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds
So finally I replaced the faulty line with
translated = model(decoder_input_ids=batch["input_ids"], **batch)
This seems to work, but the tokenizer then cannot decode the outputs, so I had to replace the first call below with the second to get any output:
tldrs = tokenizer.batch_decode(translated, skip_special_tokens=True)
tldrs = tokenizer.batch_decode(
torch.argmax(translated.logits, dim=2), skip_special_tokens=True
)
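To make the shapes in this workaround explicit, here is a small sketch using random numbers as a stand-in for `translated.logits` (the sizes are hypothetical). Taking the argmax over the vocabulary dimension turns per-position scores into one token id per position, which is the shape `batch_decode` expects:

```python
import torch

batch_size, seq_len, vocab_size = 2, 5, 11  # hypothetical sizes
# Stand-in for translated.logits: one score per vocabulary entry at every position.
logits = torch.randn(batch_size, seq_len, vocab_size)

# Pick the highest-scoring vocabulary entry at each position.
token_ids = torch.argmax(logits, dim=2)
```

This is a single forward pass conditioned on the `decoder_input_ids` that were passed in, not the step-by-step autoregressive decoding that `generate` performs, which may explain why the decoded text looks wrong.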
But my results and this code are so messy that I have obviously missed something. I didn't find many (any?) examples of how to use DataParallel with Hugging Face models for inference. Could you give me some pointers and tell me what is wrong in my script?
Thank you