DoRA training taking ~8x longer. Why?

Hi, I am enabling the use_dora flag in LoraConfig. With it disabled, training takes 16 hours; with it enabled, the estimated training time jumps to 122 hours. I have kept all other configs the same. What is causing this behaviour?

I am training Llama 8B Instruct.

Following are my LoRA config and TrainingArguments:

```yaml
lora_config:
  target_modules: "q_proj,k_proj,v_proj,o_proj,gate_proj"
  r: 32
  lora_alpha: 16
  lora_dropout: 0.05
  use_dora: True
  init_lora_weights: "gaussian"
  use_rslora: True
  freeze_layers: 0
```
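
For context, the lora_config above maps roughly onto peft.LoraConfig like this (just a sketch of how I pass the values; freeze_layers is handled outside of LoraConfig, since it is not one of its fields):

```python
from peft import LoraConfig

# Same adapter settings as the YAML above; only use_dora is toggled
# between the two runs (False -> ~16 h, True -> ~122 h estimated).
lora_config = LoraConfig(
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj"],
    r=32,
    lora_alpha=16,
    lora_dropout=0.05,
    use_dora=True,
    init_lora_weights="gaussian",
    use_rslora=True,
)
```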

```yaml
train_params:
  learning_rate: 0.00003
  per_device_train_batch_size: 1
  per_device_eval_batch_size: 4
  num_train_epochs: 3
  gradient_accumulation_steps: 8
  max_grad_norm: 1
  eval_strategy: "steps"
  eval_steps: 0.123
  optim: "adamw_8bit"
  save_steps: 0.123
  weight_decay: 0.01
  fp16: true
  save_strategy: "steps"
  warmup_ratio: 0.1
  logging_steps: 50
  gradient_checkpointing: false
  report_to: "tensorboard"
  lr_scheduler_type: "cosine"
  save_total_limit: 100
  ddp_find_unused_parameters: false
```
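
And the train_params block corresponds roughly to this transformers.TrainingArguments call (again only a sketch mirroring the values above; output_dir is a placeholder name):

```python
from transformers import TrainingArguments

# Mirrors train_params above; identical in both runs.
training_args = TrainingArguments(
    output_dir="llama-8b-dora",    # placeholder path
    learning_rate=3e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=4,
    num_train_epochs=3,
    gradient_accumulation_steps=8,
    max_grad_norm=1.0,
    eval_strategy="steps",
    eval_steps=0.123,              # fraction of total training steps
    optim="adamw_8bit",
    save_steps=0.123,
    weight_decay=0.01,
    fp16=True,
    save_strategy="steps",
    warmup_ratio=0.1,
    logging_steps=50,
    gradient_checkpointing=False,
    report_to="tensorboard",
    lr_scheduler_type="cosine",
    save_total_limit=100,
    ddp_find_unused_parameters=False,
)
```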