What are the differences, and if the Trainer can already do multi-GPU work, why do we need Accelerate?
Is Accelerate only meant for custom code (where you add or remove things yourself)?
I assume accelerate was added later and has more features like:
"""
Accelerate is a library that enables the same PyTorch code to be run across any distributed configuration by adding just
four lines of code!
tl;dr: it handles everything from CPU, single/multi GPU, multi-node, and TPU, plus DeepSpeed and mixed precision, in one simple wrapper, without the complicated calls that e.g. DDP requires for multi-GPU.
ref: my notes: https://www.evernote.com/shard/s410/sh/f1158fa5-4122-0d17-d6eb-a920461e12b6/g47Qtu6j1F58zvMnJ3fWY8v6pFFWi3I_krn5155UigRUmBzr-D8td5HaQA
"""
related: Trainer.train() with accelerate
The Trainer now uses accelerate as its backbone (our work over the last few months), so it's a question of "do you want raw accelerate, or the Trainer API?" The capabilities are the same overall.
just saw this. Pasting this as a ref: Hugging Face Trainer? · Issue #144 · huggingface/accelerate · GitHub
answer:
Since the Trainer already creates an accelerator object inside its own code, you don't have to make any code changes, except for writing your own accelerate config and launching the script as:
accelerate launch --config_file {path/to/config/my_config_file.yaml} {script_name.py} {--arg1} {--arg2} ...
An example config is given at the end.
My assumption was that there would be code changes, since every other accelerate tutorial shows them, e.g.:
+ from accelerate import Accelerator
  from transformers import AdamW, AutoModelForSequenceClassification, get_scheduler

+ accelerator = Accelerator()

  model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
  optimizer = AdamW(model.parameters(), lr=3e-5)

- device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
- model.to(device)

+ train_dataloader, eval_dataloader, model, optimizer = accelerator.prepare(
+     train_dataloader, eval_dataloader, model, optimizer
+ )

  num_epochs = 3
  num_training_steps = num_epochs * len(train_dataloader)
  lr_scheduler = get_scheduler(
      "linear",
      optimizer=optimizer,
      num_warmup_steps=0,
      num_training_steps=num_training_steps
  )

  progress_bar = tqdm(range(num_training_steps))

  model.train()
  for epoch in range(num_epochs):
      for batch in train_dataloader:
-         batch = {k: v.to(device) for k, v in batch.items()}
          outputs = model(**batch)
          loss = outputs.loss
-         loss.backward()
+         accelerator.backward(loss)
          optimizer.step()
          lr_scheduler.step()
          optimizer.zero_grad()
          progress_bar.update(1)
but those code changes are already inside the Trainer. Their integration is so seamless that it's easy to miss, or perhaps it's just not mentioned in the tutorials, so one has to look at the Trainer source code, e.g.:
if is_accelerate_available():
    from accelerate import __version__ as accelerate_version

    if version.parse(accelerate_version) >= version.parse("0.16"):
        from accelerate import skip_first_batches

    from accelerate import Accelerator
    from accelerate.uti...
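In other words, a script that sticks to the Trainer API needs no Accelerate-specific code at all, because the Trainer builds its own Accelerator internally. As a rough sketch (the checkpoint, dataset, and hyperparameters below are placeholders, not from the thread):

# Minimal Trainer-only script that can be launched with
# `accelerate launch ... script_name.py`; no Accelerator(), prepare(),
# or backward() calls are needed in user code.
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

checkpoint = "bert-base-uncased"  # placeholder model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

raw = load_dataset("imdb")        # placeholder dataset
tokenized = raw.map(lambda ex: tokenizer(ex["text"], truncation=True), batched=True)

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=8,
    num_train_epochs=1,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    tokenizer=tokenizer,
)
trainer.train()  # the Trainer handles device placement, DDP/FSDP wrapping, etc.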
So just make an accelerate config and run it, e.g.:
# -----> see this ref: https://huggingface.co/docs/accelerate/package_reference/cli#accelerate-config
# ref for fsdp to know how to change fsdp opts: https://huggingface.co/docs/accelerate/usage_guides/fsdp
# ref for accelerate to know how to change accelerate opts: https://huggingface.co/docs/accelerate/basic_tutorials/launch
# ref alpaca accelerate config: https://github.com/tatsu-lab/alpaca_farm/tree/main/examples/accelerate_configs
main_training_function: main # <- change
deepspeed_config: { }
distributed_type: FSDP
downcast_bf16: 'no'
dynamo_backend: 'NO'
# seems alpaca was based on: https://huggingface.co/docs/accelerate/usage_guides/fsdp
fsdp_config:
fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
fsdp_backward_prefetch_policy: BACKWARD_PRE
fsdp_offload_params: false
fsdp_sharding_strategy: 1
fsdp_state_dict_type: FULL_STATE_DICT
# fsdp_transformer_layer_cls_to_wrap: LlamaDecoderLayer # <-change
fsdp_transformer_layer_cls_to_wrap: FalconDecoderLayer # <-change
# fsdp_min_num_params: 7e9 # e.g., suggested heuristic: num_params / num_gpus = params/gpu, multiply by precision in bytes to know GBs used
gpu_ids: null
machine_rank: 0
main_process_ip: null
main_process_port: null
megatron_lm_config: { }
#mixed_precision: 'bf16'
#mixed_precision: 'no'
num_machines: 1
num_processes: 4
rdzv_backend: static
same_network: true
tpu_name: null
tpu_zone: null
use_cpu: false
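As a rough illustration of the heuristic mentioned in the fsdp_min_num_params comment above (assuming a hypothetical 7B-parameter model sharded across the 4 processes in this config, with bf16 weights):

# Back-of-the-envelope check for the heuristic in the fsdp_min_num_params
# comment: parameters per GPU times bytes per parameter gives the weight
# memory each rank holds once fully sharded (ignoring grads/optimizer state).
num_params = 7e9        # assumed model size, e.g. a 7B model
num_gpus = 4            # matches num_processes in the config above
bytes_per_param = 2     # bf16 -> 2 bytes per parameter

params_per_gpu = num_params / num_gpus
gb_per_gpu = params_per_gpu * bytes_per_param / 1e9
print(f"~{params_per_gpu:.2e} params/GPU, ~{gb_per_gpu:.1f} GB of weights per GPU")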
Hi, can someone help me out?
I am working in Google Colab, learning how to fine-tune a ViT model by following this tutorial: Vision Transformers (ViT) Explained + Fine-tuning in Python - YouTube.
I'm trying to define the training arguments, but I get an error when defining TrainingArguments and Trainer, where it says:
ImportError: Using the Trainer with PyTorch requires accelerate>=0.20.1: Please run pip install transformers[torch] or pip install accelerate -U
Where do I need to add accelerate? And how do I incorporate it with this code?
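For reference, the kind of code that triggers this error is roughly the following; the checkpoint and argument values here are placeholders, not the actual notebook code:

# Roughly the kind of code the error comes from: defining TrainingArguments
# and Trainer for ViT fine-tuning. Checkpoint name and values are placeholders.
from transformers import Trainer, TrainingArguments, ViTForImageClassification

model = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224-in21k",  # placeholder ViT checkpoint
    num_labels=10,
)

# In recent transformers versions this is typically where the ImportError
# about accelerate>=0.20.1 is raised if accelerate is missing or too old.
training_args = TrainingArguments(
    output_dir="./vit-finetuned",
    per_device_train_batch_size=16,
    num_train_epochs=3,
)

trainer = Trainer(model=model, args=training_args)  # add train_dataset/eval_dataset as needed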
Have you solved this? I'm running into the same issue.
After installing, you need to restart the runtime so it loads the newer version.
Thanks, I changed the runtime type to GPU without changing any code, and now it works.
The code still didn't work on CPU, though.
Do you know why?
On the CPU version, do pip install accelerate -U, make sure it shows the latest version, then do Runtime → Restart runtime and run the code again (skipping the install). Let me know if this still errors out.
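One quick way to confirm the runtime actually picked up the new version after the restart (run this in a fresh cell):

# Confirm the installed accelerate version satisfies the Trainer requirement.
import accelerate
from packaging import version

print("accelerate", accelerate.__version__)
assert version.parse(accelerate.__version__) >= version.parse("0.20.1")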
Thanks, it works on CPU now, as you said.
@muellerzr if I want to use only FSDP, do I need HF Accelerate? How would I run my script?
I have set up an accelerate configuration using "accelerate config" in bash. Now, does the Trainer create some other accelerate configuration on its own?