In my experience, Longformer and BigBird require a lot of GPU memory. I tried both on a 14 GB GPU, but I was limited to batch_size=1, which took forever to train and yielded rather poor results.