Of course. Thank you for looking into it.
For the tfmr3 finetune.py:
python finetune.py \
--learning_rate=1e-4 \
--do_train \
--do_predict \
--n_val 1000 \
--num_train_epochs 1 \
--val_check_interval 0.25 \
--max_source_length 512 --max_target_length 56 \
--freeze_embeds --label_smoothing 0.1 --adafactor --task summarization_xsum \
--model_name_or_path "tuner007/pegasus_paraphrase" \
--data_dir {data_dir} \
--output_dir {output_dir} \
--gpus 4 \
--overwrite_output_dir
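For reference, the legacy finetune.py reads `--data_dir` as a folder of line-aligned plain-text files, one example per line per split. A minimal sketch of that layout (the directory name and contents here are placeholders, not from the actual run):

```python
# Sketch of the {data_dir} layout the legacy finetune.py expects:
# parallel, line-aligned *.source/*.target files per split.
from pathlib import Path

data_dir = Path("xsum_data")  # hypothetical; passed to the script as {data_dir}
data_dir.mkdir(exist_ok=True)
for split in ("train", "val", "test"):
    (data_dir / f"{split}.source").write_text("First input document.\n")
    (data_dir / f"{split}.target").write_text("First reference summary.\n")
```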
For the new run_summarization.py:
python tfmr4/run_summarization.py \
--model_name_or_path "tuner007/pegasus_paraphrase" \
--cache_dir $CACHE_DIR \
--train_file $TRAIN_FILE \
--validation_file $VAL_FILE \
--test_file $TEST_FILE \
--output_dir $MODEL_OUTPUT_DIR \
--learning_rate=1e-4 \
--num_train_epochs=1 \
--per_device_train_batch_size=32 \
--per_device_eval_batch_size=32 \
--do_train \
--do_predict \
--max_source_length 512 \
--max_target_length 56 \
--label_smoothing_factor 0.1 \
--adafactor \
--overwrite_output_dir
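Unlike the legacy script, run_summarization.py takes CSV or JSON Lines files with a text column and a summary column (selectable via `--text_column` and `--summary_column`). A minimal sketch of converting the old .source/.target pairs into that format (the file names are mine, matching the sketch above):

```python
# Convert legacy line-aligned .source/.target pairs into a JSON Lines file
# usable as --train_file. With these column names you would also pass
# --text_column text --summary_column summary to run_summarization.py.
import json

with open("train.source") as src, open("train.target") as tgt, \
        open("train.json", "w") as out:
    for text, summary in zip(src, tgt):
        out.write(json.dumps({"text": text.strip(), "summary": summary.strip()}) + "\n")
```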
There are a couple of configs that exist in finetune.py but no longer exist in run_summarization.py (e.g., --freeze_embeds; see the sketch below). I will also look into all the possible configurations for the two scripts and spot any differences.
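Since run_summarization.py has no --freeze_embeds flag, here is a minimal sketch of how it could be replicated by patching the script right after the model is loaded. This is my own helper, not part of the library, and the module paths assume the Pegasus architecture in transformers 4.x:

```python
# Replicate the legacy --freeze_embeds behavior for Pegasus: disable
# gradients for the shared embedding and the encoder/decoder token and
# positional embeddings, then train the rest of the model as usual.
from transformers import PegasusForConditionalGeneration

def freeze_embeds(model):
    for module in (
        model.model.shared,
        model.model.encoder.embed_tokens,
        model.model.encoder.embed_positions,
        model.model.decoder.embed_tokens,
        model.model.decoder.embed_positions,
    ):
        for param in module.parameters():
            param.requires_grad = False

model = PegasusForConditionalGeneration.from_pretrained("tuner007/pegasus_paraphrase")
freeze_embeds(model)
```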