Use this category for any question or feedback related to the *Natural Language Processing with Transformers* book.

Hi Sylvain - I purchased your Transformers book and have been reading it over the past few days. It's a great deep dive into the library. I was testing the Colab notebook for Chapter 6 on summarization and I've hit a few errors I can't seem to fix. When running the fine-tuning code (i.e., `trainer.train()`) with the notebook code exactly as written, I get Error A below:

## Error A

```
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper__index_select)
```

Error A goes away when I edit the line `dataset_samsum_pt.set_format(type="torch", columns=columns)` to `dataset_samsum_pt.set_format(type="torch", columns=columns, device=device)`, where `device` is `"cuda"`, but then I end up with Error B below. Can you provide some guidance?

Thanks a lot!

R

## Error B

```
The following columns in the training set don't have a corresponding argument in `PegasusForConditionalGeneration.forward` and have been ignored: id, summary, dialogue.
***** Running training *****
  Num examples = 14732
  Num Epochs = 1
  Instantaneous batch size per device = 1
  Total train batch size (w. parallel, distributed & accumulation) = 16
  Gradient Accumulation steps = 16
  Total optimization steps = 920
```

```
TypeError                                 Traceback (most recent call last)
in ()
      1 # hide_output
----> 2 trainer.train()
      3 score = evaluate_summaries_pegasus(
      4     dataset_samsum["test"], rouge_metric, trainer.model, tokenizer,
      5     batch_size=2, column_text="dialogue", column_summary="summary")

5 frames
<__array_function__ internals> in concatenate(*args, **kwargs)

/usr/local/lib/python3.7/dist-packages/torch/_tensor.py in __array__(self, dtype)
    755             return handle_torch_function(Tensor.__array__, (self,), self, dtype=dtype)
    756         if dtype is None:
--> 757             return self.numpy()
    758         else:
    759             return self.numpy().astype(dtype, copy=False)

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
```
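For reference, I understand the generic workaround the final TypeError suggests (copying a CUDA tensor to host memory before converting it to NumPy); the sketch below illustrates that pattern in isolation, it's just not obvious to me where to apply it inside the notebook's evaluation code:

```python
import torch

# Stand-in tensor; in the traceback above, the offending tensor lives on cuda:0.
t = torch.tensor([1.0, 2.0, 3.0])

# t.numpy() raises TypeError for a CUDA tensor; .cpu() is a no-op for a
# tensor already on the CPU, so t.cpu().numpy() is safe on either device.
arr = t.cpu().numpy()
print(arr.tolist())  # [1.0, 2.0, 3.0]
```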