Hi,
I have used Hugging Face for quite some time and tried the Trainer interface following the "Fine-tune a pretrained model" tutorial.
It runs fine on Google Colab; however, when I try to run it on a Google Cloud instance it stalls at the very beginning, right after printing:
```
***** Running training *****
  Num examples = 1000
  Num Epochs = 3
  Instantaneous batch size per device = 8
  Total train batch size (w. parallel, distributed & accumulation) = 8
  Gradient Accumulation steps = 1
  Total optimization steps = 375
  Number of trainable parameters = 108314117
```
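For context, these numbers line up with the tutorial defaults: ceil(1000 / 8) = 125 steps per epoch, times 3 epochs = 375 total optimization steps, and ~108M trainable parameters matches bert-base-cased with a classification head. So the Trainer has loaded the model and dataset before it stalls.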
No error is shown, and nothing further happens.
Steps to reproduce:
- Create a compute instance on Google Cloud Platform (e.g. with a V100 GPU) using the PyTorch 1.13 image
- `pip install transformers datasets evaluate`
- Try to run the example script from the tutorial (example.py, posted on JustPaste.it); a sketch of it is included below for reference
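In case the JustPaste.it link goes down, example.py is essentially the tutorial code. Here is a sketch of what it looks like (model and dataset names are the tutorial's defaults, bert-base-cased on yelp_review_full; my actual script may differ slightly):

```python
# Sketch of example.py, assuming it follows the "Fine-tune a pretrained model" tutorial:
# bert-base-cased fine-tuned on 1000 samples of yelp_review_full with default Trainer settings.
import numpy as np
import evaluate
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    TrainingArguments,
    Trainer,
)

dataset = load_dataset("yelp_review_full")
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)

tokenized_datasets = dataset.map(tokenize_function, batched=True)

# 1000 train / 1000 eval examples, matching "Num examples = 1000" in the log above
small_train_dataset = tokenized_datasets["train"].shuffle(seed=42).select(range(1000))
small_eval_dataset = tokenized_datasets["test"].shuffle(seed=42).select(range(1000))

model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=5)

metric = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels)

# Defaults: per_device_train_batch_size=8, num_train_epochs=3
training_args = TrainingArguments(output_dir="test_trainer", evaluation_strategy="epoch")

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=small_train_dataset,
    eval_dataset=small_eval_dataset,
    compute_metrics=compute_metrics,
)

trainer.train()  # this is where it hangs on the GCP instance
```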
Does anyone have an idea what might be wrong?