Hi, I’d like to fine-tune BERT on my product catalog corpus, which contains a lot of out-of-vocabulary words such as brand names. By fine-tuning, I mean transfer learning, not training from scratch.
I have been following this [Fine-tuning a pretrained model — transformers 4.5.0.dev0 documentation] tutorial and see that it requires labels. As you can imagine, my use case is connected to information retrieval and search and does not contain any y_labels. All I want is a unique vector embedding for each item out of my trained model.
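To clarify what I mean by "embeddings", this is roughly how I plan to pull vectors out of the model once it is adapted to my corpus. The checkpoint name and the mean-pooling choice are just my assumptions, not something from the tutorial:

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Placeholder checkpoint; in practice this would be my domain-adapted BERT.
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.eval()

def embed(texts):
    # Tokenize a batch of product titles/descriptions.
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**batch)
    # Mean-pool token embeddings, ignoring padding positions.
    mask = batch["attention_mask"].unsqueeze(-1).float()
    summed = (outputs.last_hidden_state * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1e-9)
    return summed / counts

vectors = embed(["Acme TurboBlend 3000 blender", "Zenith NoiseAway headphones"])
print(vectors.shape)  # (2, 768) for bert-base
```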
How should I approach this problem using the HF Trainer module? Would self-supervised masked-language-modeling fine-tuning (sketched below) be a reasonable direction?
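Here is a rough sketch of what I'm imagining, assuming MLM fine-tuning with `DataCollatorForLanguageModeling` so that no explicit labels are needed; `catalog.txt` is a placeholder for my one-product-description-per-line file:

```python
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Placeholder path: one product title/description per line.
dataset = load_dataset("text", data_files={"train": "catalog.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

# The collator masks tokens on the fly, so the y_labels come from the text itself.
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

training_args = TrainingArguments(
    output_dir="bert-catalog-mlm",
    num_train_epochs=3,
    per_device_train_batch_size=16,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"],
    data_collator=data_collator,
)
trainer.train()
```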