Is it just me, or is MobileBERT much slower than DistilBERT on Hugging Face? When I train/fine-tune MobileBERT on a GTX 1070, I get 3.8 it/s. However, when I train DistilBERT on the same GPU, I get 15 it/s. Am I missing something? The MobileBERT paper states that it should be faster.
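If anyone wants to sanity-check the it/s numbers outside of `Trainer`, here is a rough timing sketch. The helper name, batch shape, and step count are my own assumptions, and it only measures forward + backward + optimizer step, not data loading:

```python
import time
import torch

def train_iters_per_sec(model, input_ids, labels, n_steps=20):
    """Rough training-speed probe: time forward + backward + optimizer steps."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
    # One warm-up step so one-time setup cost isn't included in the timing.
    model(input_ids=input_ids, labels=labels).loss.backward()
    optimizer.zero_grad()
    if input_ids.is_cuda:
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(n_steps):
        loss = model(input_ids=input_ids, labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
    if input_ids.is_cuda:
        torch.cuda.synchronize()  # flush queued GPU work before stopping the clock
    return n_steps / (time.perf_counter() - start)

# Example usage (downloads checkpoints from the Hub; run once per model):
# from transformers import AutoModelForSequenceClassification
# model = AutoModelForSequenceClassification.from_pretrained(
#     "google/mobilebert-uncased").cuda()   # or "distilbert-base-uncased"
# ids = torch.randint(1000, 2000, (16, 128), device="cuda")
# labels = torch.randint(0, 2, (16,), device="cuda")
# print(f"{train_iters_per_sec(model, ids, labels):.1f} it/s")
```

Comparing both checkpoints with identical batch shape should show whether the gap is really in the model or somewhere else in the pipeline.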
I have exactly the same issue. Too slow. And in my case, it doesn't converge at all. I increased the learning rate as advised in the paper, but it didn't help. Any ideas?
My code works perfectly with regular BERT and DistilBERT, but performance is very poor with MobileBERT.
Mine does converge, but training takes longer than with DistilBERT. I keep the learning rate at 2e-5. Have you tried training without an LR scheduler?
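For comparison, this is what a constant-LR loop looks like with the scheduler dropped entirely. The tiny linear model and random batches below are just stand-ins for an actual MobileBERT fine-tuning setup:

```python
import torch

# Constant-LR fine-tuning loop: AdamW fixed at 2e-5, no scheduler at all.
# The model and data here are placeholders for illustration only.
model = torch.nn.Linear(16, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

batches = [(torch.randn(8, 16), torch.randint(0, 2, (8,))) for _ in range(5)]
for features, labels in batches:
    loss = torch.nn.functional.cross_entropy(model(features), labels)
    loss.backward()
    optimizer.step()       # note: no scheduler.step() anywhere
    optimizer.zero_grad()

print(f"lr after training: {optimizer.param_groups[0]['lr']}")
```

With no scheduler, the learning rate stays at 2e-5 for the whole run, which rules out LR decay as the reason it stops converging.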