I want to use a pretrained Transformer for sentence similarity in a specific domain (automotive). I have a domain-specific ontology and want to match parameters from different XML data against the ontology entries.
From what I've learned, there are many ways to fine-tune such models, but I'm still not sure what the "best" procedure would be in my case:
a. Fine-tune it on unlabeled in-domain text data.
b. Fine-tune it on labeled sentence pairs, but not from the target domain (I lack data in this domain).
c. Take every entry of the ontology and do data augmentation with WordNet and similar tools, then manually label the new data and use the labeled in-domain data as fine-tuning input. But because of my lack of practical NLP experience, I'm not sure whether this could produce better results than standard BERT, and whether it could introduce a bias.
As you can tell, I'm new to NLP. Maybe someone here has faced similar challenges.
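To make the matching step I have in mind concrete: whichever fine-tuning route I take, the final step would be nearest-neighbour search under cosine similarity between the embedding of an XML parameter and the embeddings of the ontology entries. A minimal sketch (the 3-d vectors and the labels like "EngineSpeed" are made up; in practice they would come from a sentence encoder):

```python
import numpy as np

def cosine_sim(a, b):
    # cosine similarity between two vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def match(param_vec, ontology_vecs, ontology_labels):
    # return the ontology entry whose embedding is closest to the parameter
    scores = [cosine_sim(param_vec, v) for v in ontology_vecs]
    best = int(np.argmax(scores))
    return ontology_labels[best], scores[best]

# toy 3-d "embeddings" standing in for real sentence embeddings
ontology_vecs = [np.array([1.0, 0.0, 0.0]),
                 np.array([0.0, 1.0, 0.0])]
labels = ["EngineSpeed", "OilTemperature"]  # hypothetical ontology entries
param = np.array([0.9, 0.1, 0.0])           # embedding of an XML parameter

label, score = match(param, ontology_vecs, labels)
print(label, score)
```

With real embeddings I would just replace the toy vectors with the encoder output for each ontology entry and each XML parameter string.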