Trainer and Accelerate

What are the differences between them? If the Trainer can already handle multi-GPU training, why do we need Accelerate?

Is Accelerate only for custom training code (where you want to add or remove something yourself)?

1 Like

I assume accelerate was added later and has more features like:

"""
Accelerate is a library that enables the same PyTorch code to be run across any distributed configuration by adding just
four lines of code!

tl;dr: it handles everything from CPU to GPU(s) to multi-node to TPU, plus DeepSpeed and mixed precision, in one simple wrapper, without the complicated calls that e.g. DDP requires for multi-GPU.

ref: my notes: https://www.evernote.com/shard/s410/sh/f1158fa5-4122-0d17-d6eb-a920461e12b6/g47Qtu6j1F58zvMnJ3fWY8v6pFFWi3I_krn5155UigRUmBzr-D8td5HaQA
"""

related: Trainers.train() with accelerate

The Trainer now uses accelerate as the backbone for it (our work over the last few months), so it’s really a question of “do you want raw accelerate, or the Trainer API?”. The capabilities are the same overall :slight_smile:
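For illustration only (this toy example is mine, not part of the thread; the checkpoint name and dummy dataset are arbitrary), the Trainer-API side of that choice looks roughly like this, with device placement, mixed precision, and multi-GPU handled by the accelerate backend rather than by your own loop:

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification, TrainingArguments, Trainer

checkpoint = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# tiny in-memory dataset just to make the sketch runnable
texts = ["great movie", "terrible movie"] * 8
labels = [1, 0] * 8
encodings = tokenizer(texts, truncation=True, padding=True)

class ToyDataset(torch.utils.data.Dataset):
    def __init__(self, encodings, labels):
        self.encodings, self.labels = encodings, labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="toy-output", num_train_epochs=1, per_device_train_batch_size=4),
    train_dataset=ToyDataset(encodings, labels),
)
trainer.train()

The raw-accelerate equivalent of a loop like this is roughly the diff shown a few posts down.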

3 Likes

Just saw this. Pasting this as a ref: Hugging Face Trainer? · Issue #144 · huggingface/accelerate · GitHub

Short answer:

Since the Trainer already creates an accelerator object inside its own code, you don’t have to make any code changes, apart from writing your own accelerate config and launching with:

accelerate launch --config_file {path/to/config/my_config_file.yaml} {script_name.py} {--arg1} {--arg2} ...

An example config is given at the end.


Long answer

My assumption was that there would be code changes, since every other accelerate tutorial shows them, e.g.:

+ from accelerate import Accelerator
  from transformers import AdamW, AutoModelForSequenceClassification, get_scheduler
  from tqdm.auto import tqdm

+ accelerator = Accelerator()

  model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)
  optimizer = AdamW(model.parameters(), lr=3e-5)

- device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
- model.to(device)

+ train_dataloader, eval_dataloader, model, optimizer = accelerator.prepare(
+     train_dataloader, eval_dataloader, model, optimizer
+ )

  num_epochs = 3
  num_training_steps = num_epochs * len(train_dataloader)
  lr_scheduler = get_scheduler(
      "linear",
      optimizer=optimizer,
      num_warmup_steps=0,
      num_training_steps=num_training_steps
  )

  progress_bar = tqdm(range(num_training_steps))

  model.train()
  for epoch in range(num_epochs):
      for batch in train_dataloader:
-         batch = {k: v.to(device) for k, v in batch.items()}
          outputs = model(**batch)
          loss = outputs.loss
-         loss.backward()
+         accelerator.backward(loss)

          optimizer.step()
          lr_scheduler.step()
          optimizer.zero_grad()
          progress_bar.update(1)

but those code changes are already inside the Trainer. The integration is so seamless that it’s easy to miss, or perhaps it’s just not shown in the tutorials, so you have to look at the Trainer source code, e.g.:

if is_accelerate_available():
    from accelerate import __version__ as accelerate_version

    if version.parse(accelerate_version) >= version.parse("0.16"):
        from accelerate import skip_first_batches

    from accelerate import Accelerator
    from accelerate.utils import ...

So just make an accelerate config and run it e.g.,

# -----> see this ref: https://huggingface.co/docs/accelerate/package_reference/cli#accelerate-config
# ref for fsdp to know how to change fsdp opts: https://huggingface.co/docs/accelerate/usage_guides/fsdp
# ref for accelerate to know how to change accelerate opts: https://huggingface.co/docs/accelerate/basic_tutorials/launch
# ref alpaca accelerate config: https://github.com/tatsu-lab/alpaca_farm/tree/main/examples/accelerate_configs

main_training_function: main  # <- change

deepspeed_config: { }
distributed_type: FSDP
downcast_bf16: 'no'
dynamo_backend: 'NO'
# seems alpaca was based on: https://huggingface.co/docs/accelerate/usage_guides/fsdp
fsdp_config:
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
  fsdp_backward_prefetch_policy: BACKWARD_PRE
  fsdp_offload_params: false
  fsdp_sharding_strategy: 1
  fsdp_state_dict_type: FULL_STATE_DICT
  #  fsdp_transformer_layer_cls_to_wrap: LlamaDecoderLayer  # <-change
  fsdp_transformer_layer_cls_to_wrap: FalconDecoderLayer  # <-change
#  fsdp_min_num_params:  7e9 # e.g., suggested heuristic: num_params / num_gpus = params/gpu, multiply by precision in bytes to know GBs used
gpu_ids: null
machine_rank: 0
main_process_ip: null
main_process_port: null
megatron_lm_config: { }
#mixed_precision: 'bf16'
#mixed_precision: 'no'
num_machines: 1
num_processes: 4
rdzv_backend: static
same_network: true
tpu_name: null
tpu_zone: null
use_cpu: false
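Assuming the config above is saved as, say, fsdp_config.yaml and your training script is train.py (both names are hypothetical here), the launch from the short answer becomes:

accelerate launch --config_file fsdp_config.yaml train.py --arg1 --arg2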
1 Like

Hi, can someone help me out?

I am working in Google Colab to learn how to fine-tune a ViT model, following this tutorial: Vision Transformers (ViT) Explained + Fine-tuning in Python - YouTube.

I’m trying to define the following:

  • training and testing dataset
  • feature extractor
  • model
  • collate function
  • evaluation metric

But I get an error when defining TrainingArguments and Trainer, where it says:
ImportError: Using the Trainer with PyTorch requires accelerate>=0.20.1: Please run pip install transformers[torch] or pip install accelerate -U

Where do I need to add accelerate? And how do I incorporate it with this code?

Have you solved this? I’m running into the same problem:

After installing, you need to restart the runtime so it loads the newer version.

Thanks, I changed the runtime type to GPU without changing any code, and now it works.
It still didn’t work on CPU, though.
Do you know why?

On the CPU runtime, do pip install accelerate -U, make sure it shows the latest version, then do Runtime → Restart Runtime and run the code again (skipping the install). Let me know if this still errors out.
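For example, in a Colab cell (the version pin is taken from the error message above):

!pip install -U "accelerate>=0.20.1"

Then Runtime → Restart Runtime, and after the restart you can double-check before re-running the rest of the notebook:

import accelerate
print(accelerate.__version__)  # should now be >= 0.20.1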

:hugs: Thanks, it works on CPU as you said.

1 Like

@muellerzr If I want to use only FSDP, do I need HF accelerate? How would I run my script?