Pretraining ALBERT


I have a question about using the transformers library to pretrain ALBERT. I have been using RoBERTa for a while now, pretrained on custom data with a script from the examples directory, which works fine since RoBERTa only uses the MLM loss during pretraining. However, ALBERT adds sentence order prediction (SOP), which that script does not implement. Are there any examples of implementing SOP in the transformers library that I have overlooked? If not, would anyone care to share a code example of how to implement it? Otherwise I will have to dive a bit deeper and implement it from scratch, but I'm hoping I won't have to 🙂


I found a DataCollatorForSOP which appears to solve this task: transformers/ at d9c62047a8d75e18d2849d345ab3394875a712ef · huggingface/transformers · GitHub

It would be great to have a separate file in examples that implements pretraining for ALBERT.


May I ask how you used that DataCollator? I'm also trying to pretrain ALBERT and am running into the same difficulties.

I pretrained an ALBERT model last year. You do not need a special DataCollator for SOP; just use DataCollatorForLanguageModeling.
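
For readers wondering what the SOP part of the data could look like, here is a minimal sketch in plain Python, not the setup used by anyone in this thread. It assumes the corpus is already split into ordered text segments per document; the helper name `make_sop_pairs` and the 50% swap rate are illustrative assumptions. The label convention follows the `sentence_order_label` argument of `AlbertForPreTraining` (0 for original order, 1 for swapped order). Tokenization, masking, and batching are left out.

```python
import random


def make_sop_pairs(segments, seed=0):
    """Build (segment_a, segment_b, label) triples from consecutive segments.

    Hypothetical helper for illustration only.
    label 0: segments appear in their original order.
    label 1: the two segments have been swapped.
    """
    rng = random.Random(seed)  # seeded so the pairs are reproducible
    pairs = []
    # Walk over neighbouring segments of one document.
    for first, second in zip(segments, segments[1:]):
        if rng.random() < 0.5:
            # Swap the order and mark the pair as a negative example.
            pairs.append((second, first, 1))
        else:
            # Keep the original order as a positive example.
            pairs.append((first, second, 0))
    return pairs


if __name__ == "__main__":
    doc = [
        "ALBERT shares parameters across layers.",
        "This reduces the model size considerably.",
        "It also replaces NSP with sentence order prediction.",
    ]
    for seg_a, seg_b, label in make_sop_pairs(doc):
        print(label, "|", seg_a, "|", seg_b)
```

Each triple could then be tokenized as a sentence pair, with the label stored under `sentence_order_label` next to the token IDs. Whether a given collator carries that extra field through to the model is worth checking against the transformers version in use.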
