Fine-tuning BigBirdPegasus

ArnauC · October 13, 2021, 4:40pm

Hi all,

I’m trying to fine-tune a summarization model (bigbird-pegasus-large-bigpatent) on my own data.
I’m adapting this notebook and, as expected, I’m running out of RAM even with Google Colab Premium.

If I set max_input_length = 1024 and max_target_length = 128 (the default values), training runs. The problem is that my sequences are much longer: I could barely reduce them to around 4000 tokens.
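
For context, here is a minimal sketch of the preprocessing step where these two limits apply. It is an assumption about the setup rather than the exact notebook code: the column names `document` and `summary` are placeholders for my own fields, 4096 is just an illustrative value for inputs of around 4000 tokens, and the `text_target` argument requires a reasonably recent version of transformers.

```python
# Illustrative limits: 4096 is an assumed value for ~4000-token inputs,
# 128 is the default target length mentioned above.
max_input_length = 4096
max_target_length = 128


def preprocess_function(examples, tokenizer):
    """Tokenize a batch of documents and summaries, truncating both.

    `examples` is a dict of lists with (assumed) keys "document" and "summary".
    `tokenizer` is any Hugging Face-style tokenizer callable.
    """
    # Encoder inputs: anything longer than max_input_length is cut off.
    model_inputs = tokenizer(
        examples["document"],
        max_length=max_input_length,
        truncation=True,
    )
    # Decoder targets: tokenized separately with their own, shorter limit.
    labels = tokenizer(
        text_target=examples["summary"],
        max_length=max_target_length,
        truncation=True,
    )
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
```

With a datasets object (hypothetically named `raw_datasets`), this would be applied as `raw_datasets.map(lambda batch: preprocess_function(batch, tokenizer), batched=True)`. Raising max_input_length this way is what pushes memory use past what Colab gives me.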

I’m thinking of switching to AWS or Google Cloud for more computational power, so my question is: what kind of GPU should I be looking at for this?

Any help would be appreciated.
