Hello.
Since it already works reasonably well in practice, I think your approach is sound. BERT has many successors (e.g., RoBERTa and DeBERTa), so swapping one of those in should give you a further accuracy boost with little extra effort.
Another approach worth considering when labeled data is scarce is Positive-Unlabeled (PU) learning, which trains a binary classifier from only positive and unlabeled examples…
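To make the idea concrete, here is a minimal sketch of one classic PU recipe, the probability-calibration method of Elkan and Noto (2008): train a classifier to separate known positives from unlabeled data, estimate the labeling frequency c on held-out positives, then divide the scores by c. The synthetic data and all variable names are my own illustration, not something from your setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic 2-D data: y_true is the hidden ground truth; we only
# ever observe a random 30% of the true positives as labeled.
n = 2000
X = rng.normal(size=(n, 2))
y_true = (X[:, 0] + X[:, 1] > 0).astype(int)
labeled = (y_true == 1) & (rng.random(n) < 0.3)
s = labeled.astype(int)  # s=1: known positive, s=0: unlabeled

# Step 1: classifier for P(s=1 | x), i.e. labeled-positive vs unlabeled.
X_tr, X_hold, s_tr, s_hold = train_test_split(
    X, s, test_size=0.2, random_state=0
)
clf = LogisticRegression().fit(X_tr, s_tr)

# Step 2: estimate c = P(s=1 | y=1) as the mean score on
# held-out labeled positives (valid under the SCAR assumption).
c = clf.predict_proba(X_hold[s_hold == 1])[:, 1].mean()

# Step 3: correct the scores: P(y=1 | x) ~ P(s=1 | x) / c.
p_pos = np.clip(clf.predict_proba(X)[:, 1] / c, 0.0, 1.0)
pred = (p_pos >= 0.5).astype(int)
print("accuracy vs hidden labels:", (pred == y_true).mean())
```

The key assumption is that positives are labeled at random (SCAR); when that roughly holds, this simple correction often recovers a usable classifier from surprisingly few labels.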
Another common approach is to use a commercial LLM to label your own data and build a training set from the results. This is almost always effective if the budget allows. In your case, though, a considerable amount of data is already available, so simple rule-based processing in Python may be enough.
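As a sketch of what I mean by rule-based processing, something like the following can bootstrap weak labels from keyword rules before you spend money on API-based annotation. The keyword list and function name are purely hypothetical examples, not part of your data:

```python
# Hypothetical keyword rules; replace with patterns from your own data.
POSITIVE_HINTS = ("refund", "broken", "does not work")

def weak_label(text: str) -> int:
    """Return 1 if the text matches any complaint keyword, else 0."""
    lowered = text.lower()
    return int(any(k in lowered for k in POSITIVE_HINTS))

samples = [
    "I want a refund, the item arrived broken.",
    "Great product, fast shipping!",
]
labels = [weak_label(t) for t in samples]
print(labels)  # → [1, 0]
```

Labels produced this way are noisy, but they are often good enough to fine-tune a BERT-style model, which then generalizes beyond the literal keywords.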
Resources: