Fine-tuning with LoRA: larger models can't learn

I am having trouble getting larger models to learn with LoRA on a custom dataset I am using for classification. By "trouble to learn", I mean that my evaluation metrics (F1, recall, etc.) all come out as zero, even though my training loss decreases.

LoRA works fine on the same dataset with the same model families at smaller sizes, for instance bert-base-uncased and albert-base-v2.

Once I jump up to larger models, say albert-xlarge-v2, I get zeros across the board for my evaluation metrics.
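For what it's worth, one pattern that would produce exactly this symptom (zero F1/recall while the loss still falls) is the model collapsing to always predicting the majority class. A quick sanity check with made-up labels shows why the metrics zero out for the other class:

```python
# Minimal sketch with hypothetical labels: if the model degenerates to
# always predicting class 0, recall and F1 for class 1 are exactly zero,
# even though the loss can still decrease toward the majority-class baseline.
def recall(preds, labels, cls):
    tp = sum(p == cls and l == cls for p, l in zip(preds, labels))
    actual = sum(l == cls for l in labels)
    return tp / actual if actual else 0.0

def f1(preds, labels, cls):
    tp = sum(p == cls and l == cls for p, l in zip(preds, labels))
    pred_pos = sum(p == cls for p in preds)
    actual_pos = sum(l == cls for l in labels)
    prec = tp / pred_pos if pred_pos else 0.0
    rec = tp / actual_pos if actual_pos else 0.0
    return 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0

labels = [0, 0, 0, 1, 1]   # hypothetical eval labels
preds  = [0, 0, 0, 0, 0]   # degenerate model: always predicts class 0

print(recall(preds, labels, 1))  # 0.0
print(f1(preds, labels, 1))      # 0.0
```

Checking the distribution of predictions on the eval set would confirm or rule this out.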

What I have tried:

  1. Tested a smaller batch size on the smaller models to confirm they still learn at that batch size (they do)
  2. Increased LoRA `r` and `alpha`; last tried 32 for both values
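For reference, step 2 corresponds to a PEFT config roughly like the following (a sketch; the model name, label count, and dropout are placeholders, not my exact setup):

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSequenceClassification

# Placeholder model/label values for illustration.
model = AutoModelForSequenceClassification.from_pretrained(
    "albert-xlarge-v2", num_labels=2
)

lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,  # sequence classification task
    r=32,                        # last value tried
    lora_alpha=32,               # last value tried
    lora_dropout=0.1,            # assumed value
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```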

Has anyone run into this with custom datasets and LoRA? Any tips from your own experiments would be appreciated!