I’m trying to fine-tune a MiniLM sentence transformer model on two fairly large training datasets (400k and 1.5M anchor/positive/negative triplets) to measure the impact of the amount of domain knowledge fed into the model, e.g. how much domain data is needed to gain a given increase in downstream accuracy.
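For context, this is roughly how I build the different training-set sizes; a minimal sketch, assuming `triplets` is a list of `(anchor, positive, negative)` string tuples:

```python
import random

def subsample_triplets(triplets, sizes, seed=42):
    """Draw nested random subsets so each larger subset contains the smaller ones,
    which keeps the size-vs-accuracy comparison consistent."""
    rng = random.Random(seed)
    shuffled = triplets[:]
    rng.shuffle(shuffled)
    return {n: shuffled[:n] for n in sizes if n <= len(shuffled)}

# Example: increasing fractions of the 400k-triplet set
# subsets = subsample_triplets(all_triplets, [50_000, 100_000, 200_000, 400_000])
```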
The base model already performs pretty well on my downstream tasks (~70%), and I’d like to push that to 80–90% with fine-tuning.
Unfortunately, after numerous tries, the fine-tuned model’s performance drops to 50–60%. I’ve tried increasing/decreasing the number of epochs, batch sizes, and learning rates, and changing the loss function from TripletLoss to MultipleNegativesRankingLoss (MNR), but nothing even preserves the base model’s original scores.
Is this an overfitting issue, or is something else causing it? Would it be easier to start from a pretrained checkpoint like nreimers/MiniLM-L6-H384-uncased on Hugging Face?
Thank you in advance,