I’m trying to finetune a summarization model (bigbird-pegasus-large-bigpatent) on my own data.
Even with premium Colab I'm having memory issues, so I tried setting gradient_checkpointing = True in the Seq2SeqTrainingArguments, which is supposed to save memory although it increases computation time.
The problem is that when training starts, this argument raises an error:
AttributeError: module 'torch.utils' has no attribute 'checkpoint'
Has anyone experienced this same error?
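For context, a minimal sketch of what I understand the trainer does under the hood: torch.utils.checkpoint is a submodule that is not loaded by just importing torch, so code that calls it without an explicit import can hit exactly this AttributeError on some torch versions. The block function and tensor shapes below are just made-up illustrations:

```python
import torch
import torch.utils.checkpoint  # explicit import; plain `import torch` may not expose this submodule


def block(x):
    # stand-in for a transformer layer's forward pass
    return torch.relu(x) * 2


x = torch.randn(4, 4, requires_grad=True)
# recompute `block` in the backward pass instead of storing activations
y = torch.utils.checkpoint.checkpoint(block, x, use_reentrant=False)
y.sum().backward()
print(x.grad.shape)
```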
I read in this GitHub discussion:
that the same error appeared in some other cases, but it was supposedly solved here: