Hey,
Recently I tried to fine-tune the Llama 8B model using the Trainer. I noticed that when the Trainer starts, it automatically splits the model's weights and distributes them across different GPUs. I have eight GPUs, but I'd like to use only 2 or 3 of them. Other than setting the CUDA_VISIBLE_DEVICES environment variable, is there any other solution? I went through the TrainingArguments options but couldn't find anything relevant.
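
For reference, here's a minimal sketch of roughly what my setup looks like (the model id and the toy dataset are just placeholders; I suspect the `device_map="auto"` load is what spreads the weights over all eight GPUs):

```python
import torch
from datasets import Dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "meta-llama/Meta-Llama-3-8B"  # placeholder for the 8B checkpoint I'm using

tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # I suspect this is what shards the weights across every visible GPU
)

# toy dataset just to make the example self-contained
ds = Dataset.from_dict({"text": ["hello world"] * 8})
ds = ds.map(
    lambda x: tokenizer(x["text"], truncation=True, max_length=32),
    remove_columns=["text"],
)

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=1,
    num_train_epochs=1,
    # nothing in here seems to restrict which GPUs get used
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Is there a way, from Python or from the Trainer/TrainingArguments side, to limit this to a subset of the GPUs?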