I tried to generate summaries for CNN/DM (or XSUM) with ProphetNet by running the following commands, based on the scripts from https://github.com/huggingface/transformers/tree/master/examples/seq2seq:
$ export DATA=cnndm
$ export DATA_DIR=data/$DATA
$ export OUTPUT_DIR=output/$DATA-prophetnet
$ python -m torch.distributed.launch --nproc_per_node=2 run_distributed_eval.py \
--model_name microsoft/prophetnet-large-uncased-cnndm \
--save_dir $OUTPUT_DIR \
--data_dir $DATA_DIR \
--bs 32 \
--task summarization_cnndm
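
For reference, a minimal single-process sketch of what the script does per example (my own reduction, assuming the generic Auto* classes resolve this checkpoint; the article string is a placeholder), which separates the model itself from the torch.distributed launcher:

# Single-process sketch (not the script itself): load the checkpoint and
# generate for one example, to rule out the distributed launcher.
# Assumptions: the Auto* classes resolve microsoft/prophetnet-large-uncased-cnndm;
# the article text is a placeholder, not a real CNN/DM example.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "microsoft/prophetnet-large-uncased-cnndm"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name).eval()

article = "(one CNN/DM article here)"
inputs = tokenizer(article, max_length=512, truncation=True, return_tensors="pt")
summary_ids = model.generate(inputs["input_ids"], num_beams=4, max_length=142)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))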
Running the distributed command above, I received the following error messages:
Assertion `srcIndex < srcSelectDimSize` failed.
0%| | 0/180 [00:01<?, ?it/s]
Traceback (most recent call last):
File "run_distributed_eval.py", line 281, in <module>
run_generate()
File "run_distributed_eval.py", line 213, in run_generate
**generate_kwargs,
File "run_distributed_eval.py", line 123, in eval_data_dir
**generate_kwargs,
File "/home/rachelzheng/acl/venv/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 26, in decorate_context
return func(*args, **kwargs)
File "/home/rachelzheng/acl/venv/lib/python3.6/site-packages/transformers/generation_utils.py", line 483, in generate
model_kwargs = self._prepare_encoder_decoder_kwargs_for_generation(input_ids, model_kwargs)
File "/home/rachelzheng/acl/venv/lib/python3.6/site-packages/transformers/generation_utils.py", line 85, in _prepare_encoder_decoder_kwargs_for_generation
model_kwargs["encoder_outputs"]: ModelOutput = encoder(input_ids, return_dict=True, **encoder_kwargs)
File "/home/rachelzheng/acl/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/rachelzheng/acl/venv/lib/python3.6/site-packages/transformers/models/prophetnet/modeling_prophetnet.py", line 1225, in forward
hidden_states, attn_probs = encoder_layer(hidden_states, attention_mask=extended_attention_mask)
File "/home/rachelzheng/acl/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/rachelzheng/acl/venv/lib/python3.6/site-packages/transformers/models/prophetnet/modeling_prophetnet.py", line 1051, in forward
attention_mask=attention_mask,
File "/home/rachelzheng/acl/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/rachelzheng/acl/venv/lib/python3.6/site-packages/transformers/models/prophetnet/modeling_prophetnet.py", line 652, in forward
query_states = self.query_proj(hidden_states) / (self.head_dim ** 0.5)
File "/home/rachelzheng/acl/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/rachelzheng/acl/venv/lib/python3.6/site-packages/torch/nn/modules/linear.py", line 93, in forward
return F.linear(input, self.weight, self.bias)
File "/home/rachelzheng/acl/venv/lib/python3.6/site-packages/torch/nn/functional.py", line 1692, in linear
output = input.matmul(weight.t())
RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL error in: /pytorch/torch/lib/c10d/../c10d/NCCLUtils.hpp:136, unhandled cuda error, NCCL version 2.7.8
terminate called after throwing an instance of 'std::runtime_error'
what(): NCCL error in: /pytorch/torch/lib/c10d/../c10d/NCCLUtils.hpp:136, unhandled cuda error, NCCL version 2.7.8
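
From what I understand, the `srcIndex < srcSelectDimSize` assertion means an embedding lookup received an out-of-range index (for example, inputs longer than the model's position-embedding table); the later cuBLAS and NCCL errors look like the poisoned CUDA context failing afterwards. Re-running with CUDA_LAUNCH_BLOCKING=1 should make the traceback point at the actual failing op. Below is a sketch of a length check (assuming the standard examples/seq2seq data layout, i.e. data/cnndm/test.source with one article per line):

# Length check (a sketch): flag source lines whose tokenized length exceeds
# max_position_embeddings, since such position indices would overflow the
# position-embedding table and trip the device-side assertion above.
# Assumption: data/cnndm/test.source holds one article per line, as in the
# examples/seq2seq data format.
from transformers import AutoConfig, AutoTokenizer

model_name = "microsoft/prophetnet-large-uncased-cnndm"
tokenizer = AutoTokenizer.from_pretrained(model_name)
config = AutoConfig.from_pretrained(model_name)

with open("data/cnndm/test.source", encoding="utf-8") as f:
    for i, line in enumerate(f):
        n_tokens = len(tokenizer(line).input_ids)
        if n_tokens > config.max_position_embeddings:
            print(f"line {i}: {n_tokens} tokens > {config.max_position_embeddings}")

If any lines are flagged, truncating the inputs (e.g. max_length=512 with truncation=True at tokenization time) would be the obvious thing to try before the distributed run.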