I am having trouble getting LoRA to learn with larger models on a custom data set I am using for classification. By trouble learning, I mean my evaluation metrics (f1, recall, etc.) all come out as zero, even though my train loss decreases.
I can learn with LoRA on the same data set and the same model families at smaller sizes, for instance bert-base-uncased and albert-base-v2.
Once I jump up to larger models, say albert-xlarge-v2, I get zeros across the board for my evaluation metrics.
What I have tried:
- Tested the smaller models with a smaller batch size to confirm they still learn at that batch size (they do)
- Increased LoRA r and alpha (last tried 32 for both)
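For what it's worth, the usual way to get exactly-zero f1/recall while train loss keeps falling is the model collapsing to always predicting one class (often the majority class), which loss alone won't reveal. A minimal pure-Python sketch with hypothetical labels showing why the metrics bottom out:

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Binary precision/recall/f1 for the given positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical eval set; a "collapsed" model predicts class 0 for everything.
y_true = [0, 1, 0, 1, 1, 0, 1, 0]
y_pred = [0] * len(y_true)

print(precision_recall_f1(y_true, y_pred))  # (0.0, 0.0, 0.0)
```

So it can be worth dumping the raw predictions on the eval set for the xlarge run; if they are all one class, the problem is training dynamics (e.g. learning rate too high for the bigger model) rather than the metric code.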
I am wondering if anyone has experience / tips from their trials with custom data sets and LoRA!