Great! I think it's very much feasible to implement a conditional random field (CRF) on top of BERT. Cool idea!
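To make the idea a bit more concrete, here is a rough sketch of the decoding step a linear-chain CRF would add on top of BERT. It assumes BERT plus a linear layer has already produced a per-token score for each tag (`emissions`) and that a tag-to-tag score matrix (`transitions`) has been learned. The function name and both arguments are hypothetical, and this is plain-Python Viterbi for illustration, not an actual implementation from any library.

```python
from typing import List, Sequence


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the highest-scoring tag sequence for a linear-chain CRF.

    emissions[t][j]   : score of tag j at token position t
                        (assumed to come from a linear layer over BERT outputs)
    transitions[i][j] : score of moving from tag i to tag j (assumed learned)
    """
    if not emissions:
        return []

    num_tags = len(emissions[0])
    # Best score of any path ending in each tag at the first position.
    score = list(emissions[0])
    backpointers: List[List[int]] = []

    for t in range(1, len(emissions)):
        new_score = []
        pointers = []
        for j in range(num_tags):
            # Pick the previous tag that maximizes path score into tag j.
            best_prev = max(
                range(num_tags), key=lambda i: score[i] + transitions[i][j]
            )
            new_score.append(
                score[best_prev] + transitions[best_prev][j] + emissions[t][j]
            )
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)

    # Follow the backpointers from the best final tag to recover the path.
    best_last = max(range(num_tags), key=lambda j: score[j])
    path = [best_last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

With all-zero transitions this reduces to a per-token argmax over the emissions. The CRF only changes the result once the transition scores penalize or reward particular tag pairs, which is what makes it useful for enforcing consistent label sequences.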
Regarding pretraining:
Pre-training BERT from scratch on English takes quite some time, since the English dataset is so massive. Maybe just fine-tuning makes sense as a first step? Or further pre-training an already pre-trained English BERT on some specific data?
Very much looking forward to this project!