@stas or @sgugger would likely be able to answer this easily – thanks again for your comments on my previous query in December.
I was using the finetune_trainer.py script back in December, and found that running a script like this…
python3 -m torch.distributed.launch --nproc_per_node=8 /workspace/rabbit-py/transformers/examples/seq2seq/finetune_trainer.py \
--learning_rate=1e-4 \
--do_train --do_eval --do_predict \
--evaluation_strategy steps \
--predict_with_generate \
--n_test 100 \
--fp16 \
--sortish_sampler \
--num_train_epochs 24 \
--data_dir "/workspace/rabbit-py/corpii/short_name_sequential_source" \
--model_name_or_path "google/pegasus-large" \
--output_dir "/workspace/rabbit-py/predictions/$RUN" \
--per_device_train_batch_size 2 \
--per_device_eval_batch_size 2 \
--logging_steps 768 \
--gradient_accumulation_steps 32 \
--task 'summarization' \
--max_target_length 12 \
--val_max_target_length 12 \
--test_max_target_length 12 \
--overwrite_output_dir \
--freeze_embeds \
--adafactor \
--run_name $RUN \
"$@"
… would output checkpoint folders that looked like this:
The test_generations.txt file was exactly 100 lines long, so I assume it corresponded to the --n_test 100 argument, although I can’t be sure: I struggled for a while to understand the difference between predict, eval, and test, and eventually gave up, as the terminology was just too confusing for me.
That said, the test_generations.txt file was generated, and it was very useful.
I have now migrated to the new seq2seq script, run_seq2seq.py, from here: https://github.com/huggingface/transformers/tree/master/examples/seq2seq
I am successfully using it, with a script like this:
PREFIX=$(basename "$BASH_SOURCE")
python3 /workspace/fw-py/transformers/examples/seq2seq/run_seq2seq.py \
--model_name_or_path '/workspace/fw-py/models_foreign/pegasus_large' \
--do_train \
--do_eval \
--do_predict \
--logging_steps 768 \
--evaluation_strategy steps \
--num_train_epochs 10 \
--task summarization \
--train_file "/workspace/fw-py/corpii/${PREFIX}/train.json" \
--validation_file "/workspace/fw-py/corpii/${PREFIX}/val.json" \
--test_file "/workspace/fw-py/corpii/${PREFIX}/test.json" \
--output_dir "/workspace/fw-py/predictions/${PREFIX}" \
--overwrite_output_dir \
--per_device_train_batch_size=2 \
--per_device_eval_batch_size=2 \
--predict_with_generate \
--text_column "question" \
--summary_column "known_answer"
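In case the data format matters: each of those .json files is newline-delimited JSON, one example per line with my two columns. My understanding from skimming run_seq2seq.py (so treat this as my reading, not documentation) is that it loads them with the datasets library, along these lines — a sketch with placeholder paths, not the script’s exact code:

from datasets import load_dataset

# each line of each file looks like:
# {"question": "...", "known_answer": "..."}
raw_datasets = load_dataset(
    "json",
    data_files={
        "train": "train.json",
        "validation": "val.json",
        "test": "test.json",
    },
)
print(raw_datasets["test"][0]["question"])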
Again, I don’t understand the difference between do_eval and do_predict, and I am not sure what predict_with_generate really means; I can’t find it documented clearly anywhere, so I am just using all of them.
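For what it’s worth, here is my current guess at what predict_with_generate does, pieced together from skimming the Seq2SeqTrainer source (so treat this as an assumption, not documentation): with the flag on, the trainer actually decodes with model.generate(); with it off, evaluation only runs a teacher-forced forward pass for the loss. A rough, self-contained sketch of that distinction (the function name and shapes are mine, not the library’s):

import torch

def prediction_step_sketch(model, inputs, predict_with_generate: bool):
    # `inputs` is assumed to hold "input_ids", "attention_mask" and "labels"
    if predict_with_generate:
        # generative path: real decoding, which is what can produce readable
        # text (and hence ROUGE/BLEU, or a test_generations.txt)
        return model.generate(
            inputs["input_ids"], attention_mask=inputs["attention_mask"]
        )
    with torch.no_grad():
        # teacher-forced path: the labels drive the decoder, so this yields
        # a loss and logits but never free-running generations
        return model(**inputs).logits.argmax(dim=-1)

If that reading is right, it would explain why do_eval on its own never writes generated text anywhere.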
This script is working, and is generating checkpoint folders that look like this:
… which is a great start. However, I am missing the critical file that I need in order to see what my model outputs: the test_generations.txt file.
Does anyone know if it is still possible to generate these test generations?
I did consult the --help output, and found…
--do_predict [DO_PREDICT] Whether to run predictions on the test set.
--predict_with_generate [PREDICT_WITH_GENERATE] Whether to use generate to calculate generative metrics (ROUGE, BLEU).
Although I don’t really understand what this means, it does seem like something that could help create the test_generations.txt file; but that does not seem to be happening in my case.
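In the meantime, the workaround I am considering is to call trainer.predict() myself at the end of the run and write out the decoded text, roughly like this (an untested sketch: trainer, tokenizer, test_dataset, and training_args are the objects run_seq2seq.py already builds, and the file name is just my choice):

import os

test_results = trainer.predict(test_dataset)

# only write from the main process when running distributed
if trainer.is_world_process_zero():
    # with predict_with_generate=True, predictions should be generated token ids
    decoded = tokenizer.batch_decode(
        test_results.predictions,
        skip_special_tokens=True,
        clean_up_tokenization_spaces=True,
    )
    path = os.path.join(training_args.output_dir, "test_generations.txt")
    with open(path, "w") as f:
        f.write("\n".join(line.strip() for line in decoded))

But I would much rather use whatever the supported way is, if one exists.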
Also, FYI, I am running the script again, trying just 3 epochs, and here is the first output from the console, which I think should show all of my arguments:
02/22/2021 19:45:54 - WARNING - __main__ - Process rank: -1, device: cuda:0, n_gpu: 1, distributed training: False, 16-bits training: False
02/22/2021 19:45:54 - INFO - __main__ - Training/evaluation parameters Seq2SeqTrainingArguments(output_dir='/workspace/fw-py/predictions/translated_one', overwrite_output_dir=True, do_train=True, do_eval=True, do_predict=True, evaluation_strategy=<EvaluationStrategy.STEPS: 'steps'>, prediction_loss_only=False, per_device_train_batch_size=2, per_device_eval_batch_size=2, per_gpu_train_batch_size=None, per_gpu_eval_batch_size=None, gradient_accumulation_steps=1, eval_accumulation_steps=None, learning_rate=5e-05, weight_decay=0.0, adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-08, max_grad_norm=1.0, num_train_epochs=3.0, max_steps=-1, lr_scheduler_type=<SchedulerType.LINEAR: 'linear'>, warmup_ratio=0.0, warmup_steps=0, logging_dir='runs/Feb22_19-45-54_43a398359e63', logging_first_step=False, logging_steps=768, save_steps=500, save_total_limit=None, no_cuda=False, seed=42, fp16=False, fp16_opt_level='O1', fp16_backend='auto', local_rank=-1, tpu_num_cores=None, tpu_metrics_debug=False, debug=False, dataloader_drop_last=False, eval_steps=768, dataloader_num_workers=0, past_index=-1, run_name='/workspace/fw-py/predictions/translated_one', disable_tqdm=False, remove_unused_columns=True, label_names=None, load_best_model_at_end=False, metric_for_best_model=None, greater_is_better=None, ignore_data_skip=False, sharded_ddp=False, deepspeed=None, label_smoothing_factor=0.0, adafactor=False, group_by_length=False, report_to=['tensorboard', 'wandb'], ddp_find_unused_parameters=None, dataloader_pin_memory=True, skip_memory_metrics=False, sortish_sampler=False, predict_with_generate=True)
Thanks!