I'm trying to fine-tune llama-7b following this tutorial (GPT4ALL: Train with local data for Fine-tuning | by Mark Zhou | Medium).
My accelerate configuration:
$ accelerate env
[2023-08-20 19:22:40,268] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Copy-and-paste the text below in your GitHub issue
- `Accelerate` version: 0.21.0
- Platform: Linux-5.15.90.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
- Python version: 3.10.12
- Numpy version: 1.25.2
- PyTorch version (GPU?): 2.0.1+cu117 (True)
- PyTorch XPU available: False
- PyTorch NPU available: False
- System RAM: 15.49 GB
- GPU type: NVIDIA GeForce RTX 4070 Laptop GPU
- `Accelerate` default config:
- compute_environment: LOCAL_MACHINE
- distributed_type: DEEPSPEED
- mixed_precision: fp16
- use_cpu: True
- num_processes: 1
- machine_rank: 0
- num_machines: 1
- rdzv_backend: static
- same_network: True
- main_training_function: main
- deepspeed_config: {'gradient_accumulation_steps': 1, 'gradient_clipping': 1.0, 'offload_optimizer_device': 'cpu', 'offload_param_device': 'cpu', 'zero3_init_flag': False, 'zero_stage': 0}
- ipex_config: {'ipex': True}
- downcast_bf16: no
- tpu_use_cluster: False
- tpu_use_sudo: False
- tpu_env: []
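If I'm reading this correctly, the deepspeed_config above should correspond to roughly the following plugin object in Python (just my interpretation of the accelerate docs, not something the training script constructs explicitly):

```python
from accelerate import Accelerator
from accelerate.utils import DeepSpeedPlugin

# My interpretation of how the deepspeed_config above maps onto a plugin object;
# the argument names are taken from the accelerate docs and may not match exactly.
ds_plugin = DeepSpeedPlugin(
    gradient_accumulation_steps=1,
    gradient_clipping=1.0,
    offload_optimizer_device="cpu",
    offload_param_device="cpu",
    zero3_init_flag=False,
    zero_stage=0,
)

# I would expect `accelerate launch` to end up with something equivalent to this.
accelerator = Accelerator(deepspeed_plugin=ds_plugin, mixed_precision="fp16")
```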
However, every time I try training I get this error:
accelerate launch --cpu --num_processes=8 --num_machines=1 --machine_rank=0 train.py --config configs/train/finetune_frog.yaml
[2023-08-20 19:24:08,017] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect)
The following values were not passed to `accelerate launch` and had defaults used instead:
`--mixed_precision` was set to a value of `'no'`
`--dynamo_backend` was set to a value of `'no'`
`--num_cpu_threads_per_process` was set to `12` to improve out-of-box performance when training on CPUs
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
[2023-08-20 19:24:11,530] [INFO] [real_accelerator.py:133:get_accelerator] Setting ds_accelerator to cuda (auto detect)
wandb: Currently logged in as: wung8. Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.15.8
wandb: Run data is saved locally in /mnt/c/Users/micha/Documents/CS Projects/frothy_mug/gpt4all/gpt4all-training/wandb/run-20230820_192415-wtr65d09
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run spring-firebrand-34
wandb: ⭐️ View project at https://wandb.ai/wung8/gpt4all-gpt4all-training
wandb: 🚀 View run at https://wandb.ai/wung8/gpt4all-gpt4all-training/runs/wtr65d09
{'model_name': 'decapoda-research/llama-7b-hf', 'tokenizer_name': 'gpt2', 'gradient_checkpointing': True, 'save_name': 'nomic-ai/gpt4all-full-multi-turn', 'streaming': False, 'num_proc': 64, 'dataset_path': 'training.jsonl', 'max_length': 1024, 'batch_size': 1, 'lr': 2e-05, 'min_lr': 0, 'weight_decay': 0.0, 'eval_every': 500, 'eval_steps': 105, 'save_every': 500, 'log_grads_every': 100, 'output_dir': 'model_data/gpt4all_frog', 'checkpoint': None, 'lora': False, 'warmup_steps': 500, 'num_epochs': 2, 'wandb': True, 'wandb_entity': None, 'wandb_project_name': None, 'seed': 42}
Using 1 GPUs
Using pad_token, but it is not set yet.
Reading files ['training.jsonl']
num_proc must be <= 21. Reducing num_proc to 21 for dataset of size 21.
Loading checkpoint shards: 100%|████████████████████████████████████████| 33/33 [00:48<00:00,  1.47s/it]
Len of train_dataloader: 387
Traceback (most recent call last):
  File "/mnt/c/Users/micha/Documents/CS Projects/frothy_mug/gpt4all/gpt4all-training/train.py", line 241, in <module>
    train(accelerator, config=config)
  File "/mnt/c/Users/micha/Documents/CS Projects/frothy_mug/gpt4all/gpt4all-training/train.py", line 91, in train
    total_num_steps = (len(train_dataloader) / gradient_accumulation_steps) * (config["num_epochs"])
UnboundLocalError: local variable 'gradient_accumulation_steps' referenced before assignment
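My guess at what is happening at line 91 (this is a paraphrase of how I read train.py, not the exact source) is that gradient_accumulation_steps is only assigned inside a DeepSpeed-specific branch, so the name is never bound when no DeepSpeed plugin is attached:

```python
# Paraphrase of the failing section of train.py (not the exact source):
# gradient_accumulation_steps is only bound when a DeepSpeed plugin exists.
if accelerator.state.deepspeed_plugin is not None:
    gradient_accumulation_steps = accelerator.state.deepspeed_plugin.deepspeed_config[
        "gradient_accumulation_steps"
    ]

# With no plugin, the branch above is skipped and this line raises UnboundLocalError.
total_num_steps = (len(train_dataloader) / gradient_accumulation_steps) * config["num_epochs"]
```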
I've tried adding --config (path) and --gradient_accumulation_steps=1, but I get the same result.
From what I can tell, it's an issue with the DeepSpeed plugin, which for some reason doesn't exist:
>>> from accelerate import Accelerator
>>> accelerator = Accelerator()
>>> str(accelerator.state.deepspeed_plugin)
'None'
Is it possible that this is just a Windows/WSL issue, or have I forgotten to do something?
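As a stopgap I'm tempted to patch train.py with a fallback like the one below (untested, and the default of 1 is just my assumption), but I'd rather understand why the plugin isn't being created in the first place:

```python
from accelerate import Accelerator

accelerator = Accelerator()

# Untested workaround sketch: read the value defensively instead of relying on
# the DeepSpeed plugin being present. The fallback of 1 is my assumption, not
# something from the tutorial or the gpt4all repo.
ds_plugin = accelerator.state.deepspeed_plugin
if ds_plugin is not None:
    gradient_accumulation_steps = ds_plugin.deepspeed_config["gradient_accumulation_steps"]
else:
    gradient_accumulation_steps = 1
```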