I have used HF for quite some time and tried the Trainer API following the example in the "Fine-tune a pretrained model" tutorial.
It runs fine on Google Colab; however, when I try to run it on a Google Cloud instance, it stalls at the very beginning, right after printing:
```
***** Running training *****
  Num examples = 1000
  Num Epochs = 3
  Instantaneous batch size per device = 8
  Total train batch size (w. parallel, distributed & accumulation) = 8
  Gradient Accumulation steps = 1
  Total optimization steps = 375
  Number of trainable parameters = 108314117
```
There is no error message; it just hangs there.
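For what it's worth, the numbers in the log are internally consistent, so the Trainer appears to have set up the run correctly before stalling. A quick sanity check of the step count (the variable names below are just for illustration):

```python
import math

num_examples = 1000
num_epochs = 3
total_train_batch_size = 8  # per-device batch 8 x 1 device x 1 accumulation step

# steps per epoch, rounding up the last partial batch
steps_per_epoch = math.ceil(num_examples / total_train_batch_size)  # 125
total_optimization_steps = steps_per_epoch * num_epochs

print(total_optimization_steps)  # 375, matching the log
```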
Steps to reproduce:
- Create a compute instance on Google Cloud Platform (e.g. with a V100 GPU) using the PyTorch 1.13 image
- `pip install transformers datasets evaluate`
- Try to run this script: example.py - JustPaste.it (the example code from the tutorial)
Does anyone have an idea what's wrong?