Hi Team
Can I train a model from scratch without using a pre-trained (MLM) model? What results can I expect?
I have a corpus of 50,000 document images and I am trying to train a multimodal token classification model (LiLT, LayoutLM).
- Should I pre-train an MLM model and then fine-tune it on the same corpus, or
- should I train the model from scratch directly, with a token classification head attached?
I cannot download a pre-trained model because of organization policy, so I want to get good results with a train-from-scratch approach.
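
To make the first option concrete, here is a rough sketch of the masking step I have in mind for MLM pre-training on my own corpus. It assumes the usual BERT-style recipe (select about 15% of tokens; of those, 80% become the mask token, 10% a random token, 10% stay unchanged) and the common convention of labelling unselected positions with -100 so the loss skips them. The function name `mask_tokens` and the token IDs 101, 102, and 103 are placeholders I made up, not values from any real tokenizer. As far as I understand, for LayoutLM-style models the bounding boxes would be left as they are and only the text tokens masked.

```python
import random

IGNORE_INDEX = -100  # label value that cross-entropy losses conventionally skip


def mask_tokens(token_ids, vocab_size, mask_id, special_ids=(),
                mask_prob=0.15, rng=None):
    """Return (masked_inputs, labels) for one sequence of token IDs.

    Hypothetical helper: selects each non-special token with probability
    mask_prob, then applies the 80/10/10 replacement rule.
    """
    if rng is None:
        rng = random.Random()
    inputs = list(token_ids)
    labels = [IGNORE_INDEX] * len(inputs)
    for i, tok in enumerate(token_ids):
        # Never mask special tokens; skip tokens that are not selected.
        if tok in special_ids or rng.random() >= mask_prob:
            continue
        labels[i] = tok  # the model must predict the original token here
        roll = rng.random()
        if roll < 0.8:
            inputs[i] = mask_id                    # 80%: mask token
        elif roll < 0.9:
            inputs[i] = rng.randrange(vocab_size)  # 10%: random token
        # remaining 10%: keep the original token unchanged
    return inputs, labels


# Example with made-up IDs: 101/102 as sequence delimiters, 103 as the mask.
rng = random.Random(0)
ids = [101] + list(range(1000, 1050)) + [102]
inputs, labels = mask_tokens(ids, vocab_size=30000, mask_id=103,
                             special_ids={101, 102}, rng=rng)
```

With the second option this step disappears entirely, and the per-token entity tags are used as labels from the first training step.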