DoRA training taking ~8x longer. Why?

Hi, I am enabling the use_dora flag in LoraConfig. With it disabled, training takes 16 hours; with it enabled, the estimated training time jumps to 122 hours. I have kept all other configs the same. What is causing this behaviour?

I am training Llama 8B Instruct.

Following are my LoRA config and TrainingArguments:

```yaml
lora_config:
  target_modules: "q_proj,k_proj,v_proj,o_proj,gate_proj"
  r: 32
  lora_alpha: 16
  lora_dropout: 0.05
  use_dora: True
  init_lora_weights: "gaussian"
  use_rslora: True
  freeze_layers: 0
```
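
For context, the lora_config above maps roughly onto peft.LoraConfig like this (just a sketch of how I pass the values; freeze_layers is handled outside of LoraConfig, since it is not one of its fields):

```python
from peft import LoraConfig

# Same adapter settings as the YAML above; only use_dora is toggled
# between the two runs (False -> ~16 h, True -> ~122 h estimated).
lora_config = LoraConfig(
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj"],
    r=32,
    lora_alpha=16,
    lora_dropout=0.05,
    use_dora=True,
    init_lora_weights="gaussian",
    use_rslora=True,
)
```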

```yaml
train_params:
  learning_rate: 0.00003
  per_device_train_batch_size: 1
  per_device_eval_batch_size: 4
  num_train_epochs: 3
  gradient_accumulation_steps: 8
  max_grad_norm: 1
  eval_strategy: "steps"
  eval_steps: 0.123
  optim: "adamw_8bit"
  save_steps: 0.123
  weight_decay: 0.01
  fp16: true
  save_strategy: "steps"
  warmup_ratio: 0.1
  logging_steps: 50
  gradient_checkpointing: false
  report_to: "tensorboard"
  lr_scheduler_type: "cosine"
  save_total_limit: 100
  ddp_find_unused_parameters: false
```
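
And the train_params block corresponds roughly to this transformers.TrainingArguments call (again only a sketch mirroring the values above; output_dir is a placeholder name):

```python
from transformers import TrainingArguments

# Mirrors train_params above; identical in both runs.
training_args = TrainingArguments(
    output_dir="llama-8b-dora",    # placeholder path
    learning_rate=3e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=4,
    num_train_epochs=3,
    gradient_accumulation_steps=8,
    max_grad_norm=1.0,
    eval_strategy="steps",
    eval_steps=0.123,              # fraction of total training steps
    optim="adamw_8bit",
    save_steps=0.123,
    weight_decay=0.01,
    fp16=True,
    save_strategy="steps",
    warmup_ratio=0.1,
    logging_steps=50,
    gradient_checkpointing=False,
    report_to="tensorboard",
    lr_scheduler_type="cosine",
    save_total_limit=100,
    ddp_find_unused_parameters=False,
)
```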