Albert giving OOM compared to Bert

hfawaz · December 10, 2020, 1:35pm

Hey,
I successfully ran bert-base-uncased on a GTX 1080 (10GB) on the IMDB dataset with a batch size of 16 and a sequence length of 128.
Running albert-base-v2 with the same sequence length and the same batch size gives me out-of-memory (OOM) errors.
Is the reason that BERT is uncased? Is albert-base-v2 cased or uncased?
