Seq2SeqTrainer, `push_to_hub` returns None

Hi,

When I try to fine-tune the mt5-small model with Seq2SeqTrainer, I get this error:

   3550                 commit_message = f"Training in progress, epoch {int(self.state.epoch)}"
   3551             _, self.push_in_progress = self.repo.push_to_hub(
-> 3552                 commit_message=commit_message, blocking=False, auto_lfs_prune=True
   3553             )
   3554         finally:

TypeError: cannot unpack non-iterable NoneType object
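
From the traceback, it looks like self.repo.push_to_hub(...) returned None instead of the (url, future) pair the Trainer tries to unpack (if I understand correctly, the underlying Repository.push_to_hub can return None, e.g. when there is nothing new to push). A stripped-down illustration of the failure, not the actual Trainer code:

def fake_push_to_hub(**kwargs):
    # Hypothetical stand-in for Repository.push_to_hub returning None
    # instead of the expected (url, command) pair.
    return None

try:
    _, push_in_progress = fake_push_to_hub(
        commit_message="Training in progress", blocking=False
    )
except TypeError as err:
    print(err)  # cannot unpack non-iterable NoneType object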

Here is my code. I’ll start with the model & tokenizer initialization:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_ID = "google/mt5-small"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

And here are the Seq2SeqTrainingArguments and Seq2SeqTrainer:

from transformers import Seq2SeqTrainer, Seq2SeqTrainingArguments

MODEL_NAME = "mt5-bg-small"

EPOCHS = 15
L_RATE = 2e-4
W_DECAY = 0.01
TRAIN_BATCH_SIZE = 4
EVAL_BATCH_SIZE = 4

training_args = Seq2SeqTrainingArguments(
    output_dir=MODEL_NAME,
    evaluation_strategy="epoch",
    learning_rate=L_RATE,
    per_device_train_batch_size=TRAIN_BATCH_SIZE,
    per_device_eval_batch_size=EVAL_BATCH_SIZE,
    weight_decay=W_DECAY,
    save_total_limit=1,
    num_train_epochs=EPOCHS,
    # predict_with_generate=True,
    fp16=True,
    push_to_hub=True,
    report_to="none",
    
    # Not calculating the additional metrics - only the loss.
    prediction_loss_only=True
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    tokenizer=tokenizer,
    data_collator=data_collator,
)
trainer.train()

tokenizer.push_to_hub(MODEL_NAME)

The error occurs at the 2nd saving step (in my case, the 1000th step).
I am successfully logged into my account, using a WRITE Access Token. What might be the problem?

Please note - I am using a Kaggle Notebook with a GPU.
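
For completeness, the login is done inside the notebook itself, roughly like this (the token below is only a placeholder, not my real one):

from huggingface_hub import login

# "hf_..." is a hypothetical placeholder; in the notebook the actual WRITE token is pasted here.
login(token="hf_...")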

Thank you in advance,
Adam

Hi @auhide ,
Were you able to figure out what the problem was? I am having the exact same issue when fine-tuning a GPT-J model.

Hi @zoebat20,

I was not able to figure out the exact cause, but I did try a few different things.
What helped in my case was changing the base model: instead of google/mt5-small (1.2 GB), I used t5-small or t5-base (242 MB and 898 MB, respectively).

I guess the problem was the model size? Not sure why though.
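
Another thing that might be worth trying (I haven't verified it with mt5-small, so treat this as a sketch rather than a confirmed fix): turn off the automatic push during training and push once at the end with Trainer.push_to_hub, so the in-training push that raises the error never runs. Roughly, keeping the rest of the setup from above unchanged:

training_args = Seq2SeqTrainingArguments(
    output_dir=MODEL_NAME,
    evaluation_strategy="epoch",
    learning_rate=L_RATE,
    per_device_train_batch_size=TRAIN_BATCH_SIZE,
    per_device_eval_batch_size=EVAL_BATCH_SIZE,
    weight_decay=W_DECAY,
    save_total_limit=1,
    num_train_epochs=EPOCHS,
    fp16=True,
    push_to_hub=False,   # no push from inside the training loop
    report_to="none",
    prediction_loss_only=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    tokenizer=tokenizer,
    data_collator=data_collator,
)
trainer.train()

# Single push after training; as far as I know this also uploads the tokenizer
# that was passed to the Trainer, so the separate tokenizer.push_to_hub call
# shouldn't be needed.
trainer.push_to_hub(commit_message="End of training")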