Training from scratch without any pre-trained MLM model

Hi Team

Can I train a model from scratch, without starting from a pre-trained MLM checkpoint? What results can I expect?

I have a corpus of 50,000 document images and I am trying to train a multimodal token classification model (LiLT, LayoutLM).

  1. Should I pre-train an MLM model on this corpus and then fine-tune it on the same corpus, or
  2. Should I train the model directly from scratch with a token classification head attached?
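
To make option 1 concrete, this is roughly the masking step I have in mind for MLM pre-training. It is only a minimal sketch, assuming the standard BERT-style scheme (about 15% of tokens selected; of those, 80% replaced by a mask token, 10% by a random token, 10% left unchanged). The function name `mask_tokens`, the token ids, and the vocabulary size are placeholders of mine, not taken from the LiLT or LayoutLM code, and the layout (bounding box) inputs are left out entirely.

```python
import random


def mask_tokens(token_ids, vocab_size, mask_id, special_ids=(), mask_prob=0.15, seed=None):
    """Build one MLM training example from a list of token ids.

    Returns (inputs, labels). labels is -100 wherever the model is not asked
    to predict anything; -100 is the default ignore_index of PyTorch's
    cross-entropy loss, used here only as a convention.
    """
    rng = random.Random(seed)
    inputs = list(token_ids)
    labels = [-100] * len(token_ids)

    for i, tok in enumerate(token_ids):
        # Never mask special tokens, and skip most ordinary tokens.
        if tok in special_ids or rng.random() >= mask_prob:
            continue

        labels[i] = tok  # the model must recover the original token here
        roll = rng.random()
        if roll < 0.8:
            inputs[i] = mask_id  # 80%: replace with the mask token
        elif roll < 0.9:
            inputs[i] = rng.randrange(vocab_size)  # 10%: random token
        # remaining 10%: keep the original token unchanged

    return inputs, labels


if __name__ == "__main__":
    # Hypothetical ids: 101/102 stand in for [CLS]/[SEP], 103 for [MASK].
    example = [101, 7, 8, 9, 10, 11, 102]
    inputs, labels = mask_tokens(
        example, vocab_size=1000, mask_id=103, special_ids={101, 102}, seed=0
    )
    print(inputs)
    print(labels)
```

Option 2 would skip this step and feed the labelled tokens straight into a randomly initialised model with the classification head.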

I cannot download a pre-trained model because of my organization's policy, so I want to get good results with a train-from-scratch approach.