I have a question about using the transformers library to pretrain ALBERT. I have been using RoBERTa for a while now, pretrained on custom data with run_mlm.py from the examples directory. That works fine because RoBERTa only uses the MLM loss during pretraining. ALBERT, however, adds sentence order prediction (SOP), which is not implemented in run_mlm.py. Are there any examples of implementing SOP in the transformers library that I have overlooked? If not, would anyone care to share a code example of how to implement it? Otherwise I will have to dig deeper and implement it from scratch, but I’m hoping I won’t have to.
I found a DataCollatorForSOP which appears to solve this task: transformers/data_collator.py at d9c62047a8d75e18d2849d345ab3394875a712ef · huggingface/transformers · GitHub
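For anyone landing here later, this is a minimal sketch of what the SOP objective needs on the data side, in case it helps to see it spelled out. It is not the library's collator code. The helper name make_sop_example, its parameters, and the token ids in the usage below are all made up for illustration, and MLM masking is left out entirely. The label convention (0 = original order, 1 = swapped) is the one I believe AlbertForPreTraining expects for its sentence_order_label argument, so double check that against the docs for your version.

```python
import random
from typing import Dict, List, Optional


def make_sop_example(
    segment_a: List[int],
    segment_b: List[int],
    cls_id: int,
    sep_id: int,
    rng: Optional[random.Random] = None,
    swap: Optional[bool] = None,
) -> Dict[str, object]:
    """Build one SOP example from two consecutive segments of token ids.

    Hypothetical helper for illustration only.
    segment_a is assumed to come directly before segment_b in the source text.
    """
    if swap is None:
        # Swap the two segments with probability 0.5 unless the caller decides.
        rng = rng or random.Random()
        swap = rng.random() < 0.5

    first, second = (segment_b, segment_a) if swap else (segment_a, segment_b)

    # Layout: [CLS] first [SEP] second [SEP]
    input_ids = [cls_id] + first + [sep_id] + second + [sep_id]
    # Segment 0 covers [CLS] + first + [SEP]; segment 1 covers second + [SEP].
    token_type_ids = [0] * (len(first) + 2) + [1] * (len(second) + 1)

    return {
        "input_ids": input_ids,
        "token_type_ids": token_type_ids,
        # Assumed convention: 0 = original order, 1 = swapped order.
        "sentence_order_label": int(swap),
    }


# Usage with made-up token ids (cls_id=2 and sep_id=3 are placeholders).
example = make_sop_example([5, 6], [7, 8, 9], cls_id=2, sep_id=3, swap=True)
print(example)
```

In a real pipeline the MLM masking would still have to be applied to input_ids afterwards, and cls_id and sep_id should come from the tokenizer rather than being hard-coded.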
It would be great to have a separate file in examples that implements pretraining for ALBERT.