Hello,
When training a model with the HuggingFace Trainer, there are several ways to launch the training with DeepSpeed.
We provide the DeepSpeed config to the trainer as follows (script name = train.py):
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

training_args = TrainingArguments(
    ...,  # other arguments
    deepspeed="ds_config.json",
)
trainer = Trainer(
    model=model,
    tokenizer=tokenizer,
    args=training_args,
)
trainer.train()
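For context, the config I pass in is a fairly standard ZeRO setup. I'm not including my exact file, but it's roughly of this shape (illustrative values only, using the "auto" placeholders the Trainer integration fills in from TrainingArguments):

```json
{
  "zero_optimization": {
    "stage": 2
  },
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "fp16": {
    "enabled": "auto"
  }
}
```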
What is the best way to run this script?
1. python train.py
2. accelerate launch --config_file config.yaml train.py
3. deepspeed train.py
Option 3 is the only one that works for me; with options 1 and 2 the training runs out of memory. I suspect I am doing something wrong with those two.
Any advice here?
Thanks,
Shon