Setting specific device for Trainer

Yes.
Here is my code:

 torch.cuda.set_device(1)
 torch.cuda.current_device()
 1

epochs = 3
training_args = TrainingArguments(
    do_predict=True,
    output_dir='./results',
    overwrite_output_dir=True,
    do_train=True,
    num_train_epochs=epochs,
    per_device_train_batch_size=190,
    logging_steps=10,
    learning_rate=5e-05,
    warmup_steps=500,
    save_total_limit=100,
    logging_dir='./logs',
    save_steps=50,
)

training_args.device

The output is:

 device(type='cuda', index=0)

and

trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=train_dataset,
)

!nvidia-smi


I can see two GPUs, and nvidia-smi shows that the process is still running on GPU #0.
It seems that torch.cuda.set_device(1) doesn't work at all.
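
One possible workaround, sketched here and not confirmed for this setup, is to hide GPU #0 from the process through the CUDA_VISIBLE_DEVICES environment variable instead of calling torch.cuda.set_device. The helper name pin_visible_gpu is hypothetical. The sketch assumes it runs before anything initializes CUDA, which in a notebook means a fresh kernel.

```python
import os


def pin_visible_gpu(index: int) -> None:
    """Expose a single physical GPU to this process (hypothetical helper).

    Assumption: this runs before torch initializes CUDA; changing the
    variables afterwards has no effect on an already-initialized process.
    """
    # Number devices by PCI bus ID so the index matches what nvidia-smi lists.
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    # Only the listed device is visible to CUDA in this process.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)


pin_visible_gpu(1)

# Import torch / transformers and build TrainingArguments only after this point.
```

With only one device visible, training_args.device would still be expected to report cuda:0, because the visible devices are renumbered from zero; nvidia-smi is the place to check which physical GPU the process actually uses.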
