Pretraining ALBERT

Hello!

I have a question about using the transformers library to pretrain ALBERT. I have been using RoBERTa for a while now, which I have pretrained on custom data with run_mlm.py from the examples directory. That works fine, since RoBERTa only uses the MLM loss during pretraining. However, ALBERT adds sentence order prediction (SOP), which is not implemented in run_mlm.py. Are there any examples of implementing SOP that I have overlooked in the transformers library? If not, would anyone care to share a code example of how to implement it? Otherwise I will have to dive a bit deeper and implement it from scratch, but I’m hoping I won’t have to :slight_smile:

Thanks!

Edit:
I found a DataCollatorForSOP, which appears to solve this task: transformers/data_collator.py at d9c62047a8d75e18d2849d345ab3394875a712ef · huggingface/transformers · GitHub

It would be great to have a separate file in the examples directory that implements pretraining for ALBERT.
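
For reference, here is a minimal sketch (not an official example, and the segment texts and masked position are placeholders) of what a SOP-aware batch has to contain. AlbertForPreTraining carries both the MLM head and the SOP head and sums the two losses when it receives `labels` and `sentence_order_label`, so a SOP collator like the one linked above essentially has to produce those two tensors alongside the usual model inputs:

```python
import torch
from transformers import AlbertConfig, AlbertForPreTraining, AlbertTokenizerFast

tokenizer = AlbertTokenizerFast.from_pretrained("albert-base-v2")
model = AlbertForPreTraining(AlbertConfig())  # fresh, randomly initialised model

# A pair of consecutive segments; for a "swapped" SOP example you would encode
# them in reverse order and set sentence_order_label to 1.
encoding = tokenizer(
    "First segment of the document.",
    "Second segment that follows it.",
    return_tensors="pt",
)

# -100 means "ignore this position" for the MLM loss; a real collator applies
# DataCollatorForLanguageModeling-style masking over many positions.
input_ids = encoding["input_ids"]
mlm_labels = torch.full_like(input_ids, -100)
# Pretend position 3 was selected for masking: keep its original id as the
# label and replace it with the [MASK] token in the input.
mlm_labels[0, 3] = input_ids[0, 3]
input_ids[0, 3] = tokenizer.mask_token_id

outputs = model(
    **encoding,
    labels=mlm_labels,                       # MLM targets
    sentence_order_label=torch.tensor([0]),  # 0 = original order, 1 = swapped
)
print(outputs.loss)  # combined MLM + SOP loss
```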


May I ask how you used that DataCollator? I’m also trying to pretrain ALBERT and am running into the same difficulties.

I pretrained an ALBERT model last year. You do not need a special data collator for SOP; just use DataCollatorForLanguageModeling.
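
In case it helps anyone landing here, a minimal sketch of that MLM-only setup (AlbertForMaskedLM + DataCollatorForLanguageModeling + Trainer), where the corpus file and hyperparameters are placeholders:

```python
from datasets import load_dataset
from transformers import (
    AlbertConfig,
    AlbertForMaskedLM,
    AlbertTokenizerFast,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AlbertTokenizerFast.from_pretrained("albert-base-v2")
# Train from scratch; use AlbertForMaskedLM.from_pretrained(...) to continue pretraining instead
model = AlbertForMaskedLM(AlbertConfig())

# "corpus.txt" is a placeholder: one document (or sentence) per line
dataset = load_dataset("text", data_files={"train": "corpus.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

# Dynamic masking: 15% of tokens are selected for the MLM objective in each batch
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="albert-mlm",
        per_device_train_batch_size=16,
        num_train_epochs=1,
    ),
    train_dataset=dataset,
    data_collator=collator,
)
trainer.train()
```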
