ALBERT giving OOM compared to BERT

I tried running bert-base-uncased on a GTX 1080 (10GB) on the IMDB dataset, successfully, with a batch size of 16 and a sequence length of 128.
Running albert-base-v2 with the same sequence length and batch size gives me out-of-memory errors.
Could the reason be that BERT is uncased? Is albert-base-v2 cased or uncased?