Seq2SeqTrainer: enabled must be a bool (got NoneType)

Hey @navissivan and @ksoky,

  1. Don’t worry about the use_cache warning; it just means that we can’t use the key/value cache for the attention mechanism with gradient checkpointing. If you want to disable the warning, load the model and then set use_cache to False:
from transformers import WhisperForConditionalGeneration

model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
model.config.use_cache = False

The model behaves the same with and without the cache; the cache is only there to speed up decoding. It isn’t compatible with gradient checkpointing, so the Trainer disables it and shows a warning instead.

  2. It shouldn’t stay idle for that long. Usually this happens when we set group_by_length=True but haven’t added input_lengths in our prepare_dataset function. Have you modified the prepare_dataset function? Could you make sure the dataset that you pass to the Trainer has the input_lengths column?
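
For reference, here’s a minimal sketch of a prepare_dataset that records the audio length (the "sentence" column and the exact preprocessing are assumptions based on the standard Whisper fine-tuning setup, so adapt them to your dataset):

def prepare_dataset(batch):
    audio = batch["audio"]
    # compute log-Mel input features from the raw audio array
    batch["input_features"] = processor.feature_extractor(
        audio["array"], sampling_rate=audio["sampling_rate"]
    ).input_features[0]
    # length column used by group_by_length to bucket samples of similar duration;
    # with a name other than "length", also set length_column_name="input_lengths"
    # in your Seq2SeqTrainingArguments
    batch["input_lengths"] = len(audio["array"])
    # encode the target transcription to label ids
    batch["labels"] = processor.tokenizer(batch["sentence"]).input_ids
    return batch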

  3. A progress bar should be shown; make sure you’ve set disable_tqdm=False in your training args.

You have a couple of options for running it in the background:

  • tmux: start a new tmux session and then launch Jupyter Lab from the tmux shell:
tmux new -s mysession
jupyter lab

Then use your notebook as normal. The process will continue running even when you close your terminal. When you re-open a terminal, you can reattach with:

tmux a -t mysession

Check out the docs for more info.

  • The other option is to export the notebook as a Python script and run it with tmux or nohup:
    In the JupyterLab menu, go to File → Export Notebook As… and select ‘Export Notebook to Executable Script’. This gives you a Python script to download. Then run it with tmux (as above) or nohup:
nohup python fine-tuning-whisper.py &

You can then open a new window to view the output, which is written to nohup.out:

vim nohup.out
  4. The table is generated automatically by the Trainer if you run evaluation over the course of training.
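
For example, here’s a minimal sketch of the relevant training arguments (the step counts and batch sizes are assumptions, so match them to your run; on recent transformers versions the argument is eval_strategy rather than evaluation_strategy):

from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="/home/sivan/whisper_base_fl_ch",
    evaluation_strategy="steps",   # run evaluation during training
    eval_steps=1000,               # assumed interval; align with save_steps so each checkpoint gets a score
    save_steps=1000,
    max_steps=4000,                # assumed total training length
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    predict_with_generate=True,    # generate during eval so compute_metrics can report WER
    generation_max_length=225,
    report_to=["tensorboard"],
    disable_tqdm=False,
)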

  5. It’s possible. The checkpoint saved at step 1000 is stored in the output directory under /home/sivan/whisper_base_fl_ch/checkpoint-1000
    You can load it as follows:

model = WhisperForConditionalGeneration.from_pretrained("/home/sivan/whisper_base_fl_ch/checkpoint-1000")

You can then run a validation step:

from transformers import Seq2SeqTrainingArguments, Seq2SeqTrainer

training_args = Seq2SeqTrainingArguments(
    output_dir="/home/sivan/whisper_base_fl_ch/validation_step",
    do_train=False,
    do_eval=True,
    per_device_eval_batch_size=8,
    predict_with_generate=True,
    generation_max_length=225,
    save_strategy="no",
    report_to=["tensorboard"],
    push_to_hub=False,
    disable_tqdm=False,
)

trainer = Seq2SeqTrainer(
    args=training_args,
    model=model,
    eval_dataset=fleurs_ch["validation"],  # set to your val set
    data_collator=data_collator,
    compute_metrics=compute_metrics,
    tokenizer=processor.feature_extractor,
)

trainer.evaluate()

You can then repeat this for the checkpoints in directories checkpoint-2000, checkpoint-3000 and so on.
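
If you’d like to script that, something along these lines should work (a sketch that assumes the training arguments, data collator, metrics function and processor defined above, the default checkpoint-<step> folder naming, and a compute_metrics that returns a "wer" entry):

import glob
import os

checkpoints = sorted(
    glob.glob("/home/sivan/whisper_base_fl_ch/checkpoint-*"),
    key=lambda path: int(path.rsplit("-", 1)[-1]),  # sort by training step
)

results = {}
for checkpoint in checkpoints:
    model = WhisperForConditionalGeneration.from_pretrained(checkpoint)
    trainer = Seq2SeqTrainer(
        args=training_args,
        model=model,
        eval_dataset=fleurs_ch["validation"],
        data_collator=data_collator,
        compute_metrics=compute_metrics,
        tokenizer=processor.feature_extractor,
    )
    results[os.path.basename(checkpoint)] = trainer.evaluate()

for name, metrics in results.items():
    print(name, metrics["eval_wer"])  # the "wer" returned by compute_metrics is prefixed with "eval_"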
