I have 8 GPUs on my machine, but I only want to train the model on the first 4. How can I achieve that?
Try again, but add the os.environ call before you import anything else, like so:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2,3"
from transformers import Trainer
...
If you’re using Jupyter, you can also use magic commands (again, at the top, before importing anything else):
%env CUDA_VISIBLE_DEVICES=0,1,2,3
from transformers import Trainer
...
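If the number of GPUs changes between runs, the same idea can be wrapped in a small helper. This is only a sketch: restrict_to_first_gpus is a made-up name, not part of transformers or PyTorch, and it assumes the GPUs you want are the lowest-numbered ones. It still has to run before anything initializes CUDA, so call it before the other imports, as in the snippets above.

```python
import os


def restrict_to_first_gpus(n):
    """Hypothetical helper: expose only GPUs 0..n-1 to this process."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # Build the comma-separated index list, e.g. "0,1,2,3" for n=4.
    value = ",".join(str(i) for i in range(n))
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Call this first, then import transformers / torch afterwards.
restrict_to_first_gpus(4)
```

After this call, the process should only see the first 4 GPUs, exactly as with the hard-coded "0,1,2,3" string.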
What about cases where you want to run additional training steps and increase the number of devices each time? It looks like this method would need a new instance of the script each time.
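For what it's worth, here is a rough sketch of what "a new instance each time" could look like when driven from a single launcher script. It assumes each training stage is run as a separate child process; run_stage and show_devices are hypothetical names, and the child command below only prints the variable as a stand-in for a real training script.

```python
import os
import subprocess
import sys


def run_stage(num_gpus, command):
    """Hypothetical helper: run one stage in a child process limited to the first num_gpus GPUs."""
    env = os.environ.copy()  # leave the launcher's own environment untouched
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in range(num_gpus))
    return subprocess.run(command, env=env, check=True, capture_output=True, text=True)


# Stand-in for a real training command: it just reports what the child sees.
show_devices = [sys.executable, "-c", "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"]

for num_gpus in (2, 4, 8):
    result = run_stage(num_gpus, show_devices)
    print(f"stage with {num_gpus} GPUs sees: {result.stdout.strip()}")
```

Each child gets its own value of CUDA_VISIBLE_DEVICES, so the device count can grow from stage to stage without editing the training script itself.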