Best practice to run DeepSpeed

Hello,

When training a model with the HuggingFace Trainer, there are several ways to launch the training with DeepSpeed.

We provide the DeepSpeed config to the Trainer as follows (script name: train.py):

from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

training_args = TrainingArguments(
    ...
    deepspeed='ds_config.json'  # DeepSpeed config file passed to the Trainer
)

trainer = Trainer(
    model=model,
    tokenizer=tokenizer,
    args=training_args
)
trainer.train()

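For context, the config I have in mind is along the lines of ZeRO stage 2 with "auto" values resolved by the Trainer. The sketch below is illustrative only, not my actual ds_config.json; it also shows the dict form, since the Trainer integration accepts either a path to a JSON file or a Python dict:

# Illustrative sketch only -- not my actual ds_config.json.
# TrainingArguments(deepspeed=...) accepts either a path to a JSON file
# or a dict like this; "auto" values are filled in from TrainingArguments.
ds_config = {
    "zero_optimization": {"stage": 2},
    "fp16": {"enabled": "auto"},
    "gradient_accumulation_steps": "auto",
    "train_micro_batch_size_per_gpu": "auto",
    "train_batch_size": "auto",
}

training_args = TrainingArguments(
    ...
    deepspeed=ds_config  # dict form; deepspeed='ds_config.json' works the same way
)
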
What is the best way to run that script?

1. python train.py
2. accelerate launch --config_file config.yaml train.py
3. deepspeed train.py

I found option 3 to work best, since the other two just run out of memory. I suspect I am doing something wrong with options 1 and 2.
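
For reference, the commands I run look roughly like this (the GPU count is illustrative, and config.yaml is an Accelerate config set up for DeepSpeed with the accelerate config command):

# 1. Plain Python: a single process, no distributed launcher
python train.py

# 2. Via Accelerate, with config.yaml generated by accelerate config
accelerate launch --config_file config.yaml train.py

# 3. Via the DeepSpeed launcher (e.g. 4 GPUs; the count is illustrative)
deepspeed --num_gpus 4 train.py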

Any advice here?

Thanks,
Shon