LoRA assumes a relatively “well-behaved” base model: it only learns a low-rank correction on top of frozen weights, so it can steer capabilities the base already has but can't build new ones from scratch. If the base isn’t instruction-tuned or capable in your task domain, LoRA doesn’t get enough leverage to shift it into useful territory, especially without supervised signals.
Try increasing the batch size and r, and decreasing the learning rate. You could also switch to an instruction-tuned base model, or warm up the base with a round of continued pre-training on in-domain text before applying LoRA. Also look into QLoRA; it quantizes the base model to 4-bit and trains LoRA adapters on top of it, so the setup is a bit different. A rough sketch of those knobs is below.
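If it helps, here's a minimal sketch of what that could look like with Hugging Face `peft`, `transformers`, and `bitsandbytes`. The model name, r, batch size, and learning rate are illustrative placeholders, not tuned recommendations:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model

# QLoRA-style 4-bit loading: base weights are quantized (NF4) and frozen;
# only the LoRA adapters train in higher precision.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # placeholder; swap in an instruction-tuned base if you can
    quantization_config=bnb_config,
    device_map="auto",
)

# Larger r gives the adapter more capacity to move the base model.
lora_config = LoraConfig(
    r=32,                        # e.g. double whatever you're using now
    lora_alpha=64,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Bigger effective batch via gradient accumulation, plus a lower lr.
training_args = TrainingArguments(
    output_dir="lora-out",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,  # effective batch size of 32
    learning_rate=5e-5,             # below the ~2e-4 often used for LoRA
    num_train_epochs=3,
)
```

The QLoRA part is really just the 4-bit `BitsAndBytesConfig` load; training otherwise proceeds like plain LoRA, which is where the memory savings come from.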