How to use Elastic Weight Consolidation for domain adaptation with HuggingFace?

I am trying to adapt a pretrained classification model (BERT) to a new domain. I tried continued training on the new domain, but this reduced quality on the first domain. I read about catastrophic forgetting, and Elastic Weight Consolidation (EWC) is often recommended as a way to mitigate it. I haven’t found a way to implement it with the Hugging Face transformers library.
Do you have any idea how to do it?
Any help is highly appreciated.
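
For reference, here is a minimal sketch of the EWC idea in plain PyTorch: estimate a diagonal Fisher matrix on first-domain data, then add a quadratic penalty that pulls important weights back toward their first-domain values while training on the new domain. A tiny `nn.Linear` classifier and random tensors stand in for BERT and the real datasets, and the helper names `estimate_fisher` and `ewc_penalty` and the value of `ewc_lambda` are hypothetical choices for illustration, not part of the transformers library.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def estimate_fisher(model, batches):
    """Diagonal Fisher estimate from first-domain batches.

    Assumption: squared batch-level gradients are used as a cheap
    approximation; per-example gradients would be closer to the
    usual definition of the empirical Fisher.
    """
    fisher = {n: torch.zeros_like(p)
              for n, p in model.named_parameters() if p.requires_grad}
    model.eval()
    for inputs, labels in batches:
        model.zero_grad()
        F.cross_entropy(model(inputs), labels).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
    return {n: f / max(len(batches), 1) for n, f in fisher.items()}


def ewc_penalty(model, fisher, old_params):
    """Sum over parameters of F_i * (theta_i - theta_old_i)^2."""
    penalty = torch.zeros(())
    for n, p in model.named_parameters():
        if n in fisher:
            penalty = penalty + (fisher[n] * (p - old_params[n]) ** 2).sum()
    return penalty


torch.manual_seed(0)
model = nn.Linear(4, 2)  # stand-in for a fine-tuned classifier

# 1) After training on the first domain: snapshot weights and Fisher.
old_batches = [(torch.randn(8, 4), torch.randint(0, 2, (8,))) for _ in range(3)]
fisher = estimate_fisher(model, old_batches)
old_params = {n: p.detach().clone() for n, p in model.named_parameters()}
penalty_before = ewc_penalty(model, fisher, old_params).item()

# 2) One training step on the new domain with the EWC term added.
model.train()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
new_x, new_y = torch.randn(8, 4), torch.randint(0, 2, (8,))
ewc_lambda = 100.0  # hypothetical strength; needs tuning per task

optimizer.zero_grad()
task_loss = F.cross_entropy(model(new_x), new_y)
loss = task_loss + (ewc_lambda / 2) * ewc_penalty(model, fisher, old_params)
loss.backward()
optimizer.step()

penalty_after = ewc_penalty(model, fisher, old_params).item()
```

The same two helpers should work on any `nn.Module` whose forward pass returns logits; for a transformers model the forward call would need to be adapted to read `outputs.logits`. With the `Trainer` API, one place to add the penalty would be an overridden `compute_loss` in a `Trainer` subclass.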
