Help with BERT Adapter + LoRA for Multi-Label Classification (301 classes)

Do you have any advice to help guide my research?

I’m not very familiar with NLP itself, so I think I can only help with troubleshooting…:sweat_smile:

Would it make sense to try fine-tuning the model directly without using LoRA?

Yeah, I think so. Bugs aside, using LoRA (PEFT) can change what the model learns and how well it learns, for better or worse. Especially when training a model from scratch, it is usually safer to get a working baseline without LoRA first.
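Just to make the suggestion concrete, here is a minimal sketch of plain full fine-tuning for multi-label classification with 301 labels, with no LoRA/PEFT involved. The checkpoint name, toy texts, and label indices are placeholders of my own, not your actual setup:

```python
# Minimal sketch: full fine-tuning of a BERT-style model for multi-label
# classification (301 labels), no LoRA/PEFT. All names and data are placeholders.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "bert-base-uncased"  # assumption: any BERT-style checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    num_labels=301,
    problem_type="multi_label_classification",  # uses BCEWithLogitsLoss internally
)

texts = ["example document one", "example document two"]  # toy data
labels = torch.zeros(2, 301)
labels[0, [3, 17]] = 1.0   # each sample can carry several of the 301 labels
labels[1, [42]] = 1.0

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
outputs = model(**batch, labels=labels)
outputs.loss.backward()    # every BERT weight gets gradients; nothing is frozen
```

If this baseline trains cleanly, you can reintroduce LoRA afterwards and compare against it.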

By the way, I suspect bias from overlapping classes could also be a factor. With only a few classes this rarely matters, but with 301 classes it may well be one of the causes here.
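One rough way to check this is to look at per-label frequencies and the label co-occurrence matrix of your training set. This is just a sketch assuming `y` is a 0/1 multi-label matrix of shape (num_samples, 301); the random data here is only a stand-in for your real labels:

```python
# Sketch: inspect label frequency and overlap in a multi-label dataset.
# `y` is assumed to be a (num_samples, 301) matrix of 0/1 labels.
import numpy as np

rng = np.random.default_rng(0)
y = (rng.random((1000, 301)) < 0.02).astype(np.float32)  # placeholder labels

freq = y.mean(axis=0)                  # how often each of the 301 labels fires
cooc = y.T @ y                         # co-occurrence counts between label pairs
overlap = cooc / np.maximum(np.diag(cooc), 1)[:, None]  # roughly P(label j | label i)

print("rarest labels:", np.argsort(freq)[:10])
print("most overlapping pair:",
      np.unravel_index(np.argmax(overlap - np.eye(301)), overlap.shape))
```

If a handful of labels almost always co-occur or some labels are extremely rare, that imbalance can dominate the loss and look like a training bug even when the code is fine.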