Anomaly Detection / Out of Domain Detection with BERT

hanshupe · July 26, 2022, 7:01am

Are there any best practices for detecting whether a document is outside the domain that a fine-tuned BERT model was trained on? One idea is to run anomaly detection before applying the fine-tuned model. The anomaly detector could be a one-class SVM or an autoencoder based on SBERT embeddings. Another option would be to add an “Other” class to the classification model, but the resulting dataset would probably be highly imbalanced.
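To make the first idea concrete, here is a minimal sketch of the one-class SVM variant using scikit-learn. It is only an illustration under assumptions: the function names `fit_domain_detector` and `is_in_domain` are made up for this post, the value of `nu` is an arbitrary starting point, and random vectors stand in for the document embeddings, which in practice would come from an SBERT model.

```python
import numpy as np
from sklearn.preprocessing import normalize
from sklearn.svm import OneClassSVM


def fit_domain_detector(in_domain_embeddings, nu=0.05):
    """Fit a one-class SVM on embeddings of in-domain documents only.

    nu is roughly the share of training documents allowed to fall outside
    the learned region; 0.05 is an assumed default, not a recommendation.
    """
    # L2-normalize rows so the RBF kernel depends on direction, not length.
    X = normalize(np.asarray(in_domain_embeddings, dtype=float))
    detector = OneClassSVM(kernel="rbf", gamma="scale", nu=nu)
    detector.fit(X)
    return detector


def is_in_domain(detector, embeddings):
    """Return a boolean array: True where a document looks in-domain."""
    X = normalize(np.asarray(embeddings, dtype=float))
    # OneClassSVM.predict returns +1 for inliers and -1 for outliers.
    return detector.predict(X) == 1


# Synthetic stand-in for SBERT embeddings (assumption: real embeddings
# would be computed from the documents with a sentence-embedding model).
rng = np.random.default_rng(0)
dim = 32
center = np.ones(dim)
train = center + 0.1 * rng.normal(size=(500, dim))
in_domain_test = center + 0.1 * rng.normal(size=(50, dim))
out_of_domain_test = -center + 0.1 * rng.normal(size=(50, dim))

detector = fit_domain_detector(train)
in_rate = is_in_domain(detector, in_domain_test).mean()
out_rate = is_in_domain(detector, out_of_domain_test).mean()
print(f"accepted in-domain: {in_rate:.2f}, accepted out-of-domain: {out_rate:.2f}")
```

Only documents that pass `is_in_domain` would then be sent to the fine-tuned classifier. Real out-of-domain text is unlikely to be as cleanly separated as this toy data, so `nu` and the kernel settings would need tuning on a held-out set.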

I am wondering whether there is a recommended approach for this common real-world problem.
