Best practice to run DeepSpeed

Hello,

When training a model with the Hugging Face Trainer, there are several ways to launch the training with DeepSpeed.

We provide the DeepSpeed config to the Trainer as follows (script name = train.py):

from transformers import AutoTokenizer, AutoModelForCausalLM, Trainer, TrainingArguments

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

training_args = TrainingArguments(
    ...
    deepspeed='ds_config.json'  # path to the DeepSpeed config file
)

trainer = Trainer(
    model=model,
    tokenizer=tokenizer,
    args=training_args
)
trainer.train()
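
As a side note, my understanding is that TrainingArguments.deepspeed also accepts an already-loaded config dict instead of a file path, which keeps everything in one script. A minimal sketch, assuming a ZeRO stage 2 setup; I don't know what your ds_config.json actually contains, so treat these values as placeholders:

from transformers import TrainingArguments

# Placeholder DeepSpeed config: ZeRO stage 2 with "auto" values that the
# Trainer fills in from TrainingArguments. Adjust to match your ds_config.json.
ds_config = {
    "zero_optimization": {"stage": 2},
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "fp16": {"enabled": "auto"},
}

training_args = TrainingArguments(
    output_dir="out",        # hypothetical output directory
    deepspeed=ds_config,     # a dict works here as well as a JSON file path
)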

What is the best way to run that script?

1. python train.py
2. accelerate launch --config_file config.yaml train.py
3. deepspeed train.py

I found option 3 to work best, since the other options just run out of memory. I suspect I am doing something wrong with options 1 and 2.

Any advice here?

Thanks,
Shon


Did you manage to find a good answer? I have the same doubt about what the best approach is.

Thanks!

I’m also interested in the answer, though it’s my understanding so far that the first one probably won’t work.