I’m trying to fine-tune a summarization model (bigbird-pegasus-large-bigpatent) on my own data.
I’m adapting this notebook and, as expected, I’m running into RAM issues even with Google Colab Premium.
If I set max_input_length = 1024 and max_target_length = 128 (the default values), training runs. The problem is that my sequences are much longer: I could barely get them down to around 4000 tokens.
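To put that gap in numbers, here is a rough back-of-the-envelope sketch. It assumes the attention cost grows about linearly with input length, which is what BigBird's block-sparse attention is designed for, and shows the quadratic case of full self-attention for comparison. The function name and the scaling exponents are illustrative assumptions, not measurements.

```python
def relative_attention_cost(new_len: int, base_len: int, exponent: float = 1.0) -> float:
    """Ratio of per-example attention cost between two input lengths.

    exponent=1.0 models linear scaling (assumed here for block-sparse attention).
    exponent=2.0 models quadratic scaling (full self-attention).
    This is a crude estimate; it ignores weights, optimizer state and batch size.
    """
    if new_len <= 0 or base_len <= 0:
        raise ValueError("sequence lengths must be positive")
    return (new_len / base_len) ** exponent


base_len = 1024    # max_input_length at which training runs
target_len = 4000  # approximate length of the longer inputs

linear = relative_attention_cost(target_len, base_len)
quadratic = relative_attention_cost(target_len, base_len, exponent=2.0)
print(f"linear estimate: {linear:.1f}x, quadratic estimate: {quadratic:.1f}x")
```

So even in the optimistic linear case, each example would need roughly four times the attention memory of the default setup.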
I’m thinking of switching to AWS or Google Cloud for more computational power, so my question is: any suggestions on what kind of GPU (and how much memory) I should be working with?
Any help would be appreciated.