How to use trainer with deepspeed

Kason123 · January 12, 2024, 2:05am

Hi, I am trying to use DeepSpeed with the Hugging Face Trainer, using the code below. However, it fails with the following error: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1! Can someone help me resolve this issue?

from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir=outputdir,
    per_device_train_batch_size=script_args.batch_size,
    deepspeed="deepspeed_config.json",
    bf16=True,
    bf16_full_eval=True,
    gradient_accumulation_steps=script_args.gradient_accumulation_steps,
    learning_rate=script_args.learning_rate,
    logging_steps=script_args.logging_steps,
    num_train_epochs=script_args.num_train_epochs,
    max_steps=script_args.max_steps,
    report_to=script_args.log_with,
    save_steps=script_args.save_steps,
    save_total_limit=script_args.save_total_limit,
    push_to_hub=script_args.push_to_hub,
    hub_model_id=script_args.hub_model_id,
)

trainer = Trainer(
    model,
    training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["eval"],
    tokenizer=tokenizer,
)
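
For context, here is a minimal sketch of what a deepspeed_config.json like the one passed to TrainingArguments above might contain. The post does not show the actual file, so the ZeRO stage and every value below are assumptions, and write_ds_config is a hypothetical helper. The "auto" entries are placeholders that the Trainer integration fills in from TrainingArguments, which keeps the two sets of settings from disagreeing.

```python
import json

# Assumed ZeRO stage 2 configuration (not the poster's actual file).
# "auto" values are resolved by the Trainer from TrainingArguments.
ds_config = {
    "bf16": {"enabled": "auto"},
    "zero_optimization": {
        "stage": 2,
        "overlap_comm": True,
        "contiguous_gradients": True,
    },
    "gradient_accumulation_steps": "auto",
    "gradient_clipping": "auto",
    "train_batch_size": "auto",
    "train_micro_batch_size_per_gpu": "auto",
}


def write_ds_config(path="deepspeed_config.json"):
    """Hypothetical helper: write the config dict to a JSON file and return its path."""
    with open(path, "w") as f:
        json.dump(ds_config, f, indent=2)
    return path


if __name__ == "__main__":
    # The resulting file can be passed as deepspeed="deepspeed_config.json".
    # Multi-GPU runs are normally started with the DeepSpeed launcher, e.g.
    #   deepspeed --num_gpus=2 train.py   (train.py is a placeholder script name)
    print(write_ds_config())
```

TrainingArguments also accepts the dictionary itself for the deepspeed argument, so writing the file is optional. Whether a config like this resolves the device error depends on how the script is launched and how the model is loaded, neither of which is shown in the post.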
