Is it possible to use masked language modeling (MLM) as the pretraining objective for a causal LM (`AutoModelForCausalLM`-type model) such as MPT or Falcon? If so, has anyone tried it? Are there any relevant codebases I could use?

