New seq2seq tool: search hparam space with run_eval.py

FYI, there is a new tool available to you: you can now search the hparam space used by run_eval.py.

It’s called run_eval_search.py.

It uses the same arguments as run_eval.py, but allows you to parametrize the hparams, so in addition to the normal args you can pass:

--search="num_beams=8:11:15 length_penalty=0.9:1.0:1.1 early_stopping=true:false"

and it’ll search all the possible combinations and, at the end, print a table of results sorted by the task’s score, e.g.:


bleu  | num_beams | length_penalty | early_stopping
----- | --------- | -------------- | --------------
41.35 |        11 |            1.1 |              0
41.33 |        11 |            1.0 |              0
41.33 |        11 |            1.1 |              1
41.32 |        15 |            1.1 |              0
41.29 |        15 |            1.1 |              1
41.28 |        15 |            1.0 |              0
41.25 |         8 |            1.1 |              0
41.24 |        11 |            1.0 |              1
41.23 |        11 |            0.9 |              0
41.20 |        15 |            1.0 |              1
41.18 |         8 |            1.0 |              0

You can search one or more params at once.
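
To make the expansion concrete, here is a minimal sketch (not the actual implementation, which lives in run_eval_search.py) of how such a --search string can be turned into a grid of hparam combinations. As noted further down in this thread, the tool enumerates combinations with itertools.product; the expand_search helper below is hypothetical:

from itertools import product

def expand_search(search: str):
    # 'num_beams=8:11:15 early_stopping=true:false'
    #   -> {'num_beams': ['8', '11', '15'], 'early_stopping': ['true', 'false']}
    grid = {}
    for assignment in search.split():
        name, values = assignment.split("=")
        grid[name] = values.split(":")
    # Cartesian product of all value lists -> one dict per combination
    return [dict(zip(grid, combo)) for combo in product(*grid.values())]

combos = expand_search("num_beams=8:11:15 length_penalty=0.9:1.0:1.1 early_stopping=true:false")
print(len(combos))  # 18 (3 * 3 * 2)
print(combos[0])    # {'num_beams': '8', 'length_penalty': '0.9', 'early_stopping': 'true'}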

Here is an example of a full command:

PYTHONPATH="src:examples/seq2seq" python examples/seq2seq/run_eval_search.py \
facebook/wmt19-$PAIR $DATA_DIR/val.source $SAVE_DIR/test_translations.txt \
--reference_path $DATA_DIR/val.target --score_path $SAVE_DIR/test_bleu.json \
--bs $BS --task translation \
--search="num_beams=1:5 length_penalty=0.9:1.1 early_stopping=true:false"

If you encounter any issues please let me know.

It’s documented here: https://github.com/huggingface/transformers/blob/master/examples/seq2seq/README.md#run_eval-tips-and-tricks. @sshleifer and I added some more goodies in run_eval.py - you will find them all documented at that url.

Enjoy.

P.S. Edited to remove things that are going to change, based on Sam’s comment below.


Great work!

There are only two possible sets of keys to get from run_eval.py, since:
score_fn = calculate_bleu_score if "translation" in args.task else calculate_rouge

You shouldn’t hard-code the possible tasks any more than that, IMO.
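
For reference, a minimal sketch of what deriving the metric keys from the task the same way could look like (the exact key names returned by calculate_bleu_score and calculate_rouge are an assumption here):

def task_score_keys(task: str):
    # Mirror run_eval.py's score_fn selection instead of hard-coding task names.
    if "translation" in task:
        return ["bleu"]  # assumed key reported by calculate_bleu_score
    return ["rouge1", "rouge2", "rougeL"]  # assumed keys reported by calculate_rouge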


Ah, thank you for clarifying that. I will adjust it to follow the same logic.

This is awesome! Thanks @stas


I haven’t checked the code; I’m on mobile right now. But are there many scenarios where we actually need to do hyperparameter search on the evaluation/inference side? In addition, does this use the optuna implementation that is being worked on in the trainer by @sgugger, or is it a separate implementation?

When you train a seq2seq model on a new summarization or translation dataset (or any other seq2seq task) and want to decide how many beams to use, whether to apply a length penalty, what the max seq length should be, what no_repeat_ngram_size should be, etc., all of these parameters affect the metrics, so this tool helps you make those decisions.

It does not use optuna; it just uses itertools.product to enumerate the different combinations and evaluates each of them, i.e. it is a plain exhaustive grid search rather than a guided search.
