Huge difference in speed when finetuning summarization with different scripts

Of course. Thank you for looking into it.

For the tfmr3 finetune.py:

python finetune.py \
    --learning_rate=1e-4 \
    --do_train \
    --do_predict \
    --n_val 1000 \
    --num_train_epochs 1 \
    --val_check_interval 0.25 \
    --max_source_length 512 --max_target_length 56 \
    --freeze_embeds --label_smoothing 0.1 --adafactor --task summarization_xsum \
    --model_name_or_path "tuner007/pegasus_paraphrase" \
    --data_dir {data_dir} \
    --output_dir {output_dir} \
    --gpus 4 \
    --overwrite_output_dir
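
One thing I noticed: this command doesn't pass a batch size, so finetune.py falls back to whatever default its argument parser defines, while the run_summarization.py command below pins --per_device_train_batch_size=32. Assuming the legacy script still prints its argparse help (it did in the tfmr3 seq2seq examples), the default is quick to check:

# Hypothetical check: dump the argparse help and look around the
# batch-size options (assumes the tfmr3 seq2seq examples layout).
python finetune.py --help | grep -i -B1 -A2 "batch"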

For the new run_summarization.py:

python tfmr4/run_summarization.py \
    --model_name_or_path "tuner007/pegasus_paraphrase" \
    --cache_dir $CACHE_DIR \
    --train_file $TRAIN_FILE \
    --validation_file $VAL_FILE \
    --test_file $TEST_FILE \
    --output_dir $MODEL_OUTPUT_DIR \
    --learning_rate=1e-4 \
    --num_train_epochs=1 \
    --per_device_train_batch_size=32 \
    --per_device_eval_batch_size=32 \
    --do_train \
    --do_predict \
    --max_source_length 512 \
    --max_target_length 56 \
    --label_smoothing 0.1 \
    --adafactor \
    --overwrite_output_dir
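
Another difference that might account for part of the speed gap: the finetune.py run spreads over 4 GPUs via --gpus 4 (PyTorch Lightning handles that), while the command above launches run_summarization.py as a plain python process. If I understand the Trainer correctly, that falls back to DataParallel across whatever GPUs are visible, and you have to go through the distributed launcher to get DistributedDataParallel. A sketch of the equivalent DDP launch, assuming 4 GPUs and otherwise identical flags:

# Hypothetical DDP launch on 4 GPUs; all training flags are the
# same as in the plain-python command above.
python -m torch.distributed.launch --nproc_per_node=4 \
    tfmr4/run_summarization.py \
    --model_name_or_path "tuner007/pegasus_paraphrase" \
    --cache_dir $CACHE_DIR \
    --train_file $TRAIN_FILE \
    --validation_file $VAL_FILE \
    --test_file $TEST_FILE \
    --output_dir $MODEL_OUTPUT_DIR \
    --learning_rate=1e-4 \
    --num_train_epochs=1 \
    --per_device_train_batch_size=32 \
    --per_device_eval_batch_size=32 \
    --do_train \
    --do_predict \
    --max_source_length 512 \
    --max_target_length 56 \
    --label_smoothing 0.1 \
    --adafactor \
    --overwrite_output_dir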

There were a couple of configs that exist in finetune.py but are no longer present in run_summarization.py. I will also go through all the possible configurations for the two scripts and spot any differences.
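
Since both scripts build their CLIs on argparse-style parsers, one quick (hypothetical) way to enumerate every option each one accepts and eyeball the differences:

# Hypothetical side-by-side dump of each script's accepted options.
python finetune.py --help > tfmr3_args.txt
python tfmr4/run_summarization.py --help > tfmr4_args.txt
diff tfmr3_args.txt tfmr4_args.txt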