Not getting a substantial training time improvement with LoRA - is this expected?

Hi all.

I am not getting a substantial training time improvement with LoRA.

Details:
My dataset has ~2M training samples.

  1. When I fine-tune the roberta-base model (~125M trainable parameters), training time is roughly 30.5 hours.
  2. When I fine-tune roberta-base with LoRA (r=8, lora_alpha=8, ~2M trainable parameters), training time is roughly 29 hours. A rough sketch of my setup is below.
  • Batch size, number of GPUs, and number of epochs are the same for both 1 and 2.
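
For reference, here is a minimal sketch of the kind of LoRA setup I mean, assuming the Hugging Face `peft` and `transformers` libraries. The `target_modules` choice and the classification task type are assumptions for illustration; my actual script may differ.

```python
# Hypothetical sketch of the LoRA setup described above (assumed libraries: transformers + peft).
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, TaskType, get_peft_model

# Load the full roberta-base model (~125M parameters).
base_model = AutoModelForSequenceClassification.from_pretrained("roberta-base")

# LoRA config matching the numbers quoted above; target_modules is an assumption
# (adapting the attention query/value projections is a common default for RoBERTa).
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,
    lora_alpha=8,
    lora_dropout=0.0,
    target_modules=["query", "value"],
)

# Wrap the base model so only the LoRA adapter weights are trainable.
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # reports the ~2M trainable parameters mentioned above
```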

Are we guaranteed to observe a substantial improvement in training time when using LoRA on large datasets?

Can anyone please advise on this?