@patrickvonplaten @valhalla @stas
For many seq2seq models in the hub, num_beams can be set meaningfully lower without hurting metrics.
For xsum and cnn, I tried a range of values and settled on the ones listed below (models whose default is already good are not listed). The defaults are 8 for all pegasus models, 6 for bart*xsum, and 4 for bart*cnn. It's not clear whether the defaults should be changed from the published parameters (though it would be nice to save compute for pipelines and the inference API), so I figured I'd just post this for people who want faster inference. The speedups are substantial: between 20% and 100%. Lowering num_beams tends to be easier on cnn_dailymail than on xsum.
google/pegasus-cnn_dailymail: 4
sshleifer/distill-pegasus-cnn-16-4: 4
sshleifer/pegasus-cnn-ft-v2: 4
sshleifer/distilbart-cnn-12-3: 3
sshleifer/distilbart-cnn-12-6: 2
sshleifer/distilbart-cnn-6-6: 2
sshleifer/distill-pegasus-xsum-16-4: 4
sshleifer/distill-pegasus-xsum-12-12: 4
facebook/bart-large-cnn
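In case it helps anyone apply these, here is a rough sketch of passing a lower num_beams at generation time instead of editing the model config. The dict only copies the values listed above; `generation_kwargs` and `summarize` are hypothetical helper names made up for this post, not part of transformers. For any model without a listed value, nothing is overridden and the config default is used.

```python
# Suggested num_beams values, copied from the list in this post.
SUGGESTED_NUM_BEAMS = {
    "google/pegasus-cnn_dailymail": 4,
    "sshleifer/distill-pegasus-cnn-16-4": 4,
    "sshleifer/pegasus-cnn-ft-v2": 4,
    "sshleifer/distilbart-cnn-12-3": 3,
    "sshleifer/distilbart-cnn-12-6": 2,
    "sshleifer/distilbart-cnn-6-6": 2,
    "sshleifer/distill-pegasus-xsum-16-4": 4,
    "sshleifer/distill-pegasus-xsum-12-12": 4,
}


def generation_kwargs(model_name):
    """Return {'num_beams': n} for listed models, else {} (keep the config default)."""
    n = SUGGESTED_NUM_BEAMS.get(model_name)
    return {} if n is None else {"num_beams": n}


def summarize(model_name, text):
    """Summarize one text, overriding num_beams only for listed models."""
    # Imported lazily so the table above is usable without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    batch = tokenizer([text], truncation=True, return_tensors="pt")
    # Keyword arguments passed to generate() take precedence over config values.
    summary_ids = model.generate(**batch, **generation_kwargs(model_name))
    return tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0]
```

Passing the value per call leaves the published defaults untouched, which seems like the safer option while it is undecided whether the defaults should change.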
Here are some rouge2 vs num_beams plots for different models
[Plot: XSUM]
[Plot: CNN]
Another note: for facebook/bart-large-xsum, prefix=" " hurts rouge2 by .02 and should be removed. It has no impact on facebook/bart-large-cnn.
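To make the prefix point concrete, here is a minimal sketch of how a prefix typically ends up in the model input. It assumes the prefix is simply prepended to each source text before tokenization and that an explicitly passed value overrides the one stored in the config. `with_prefix` and its parameters are hypothetical names for illustration, not a transformers API.

```python
def with_prefix(texts, prefix=None, config_prefix=""):
    """Prepend a prefix to each source text before tokenization.

    An explicitly passed prefix (including the empty string) overrides
    the value that would otherwise come from the model config.
    """
    chosen = config_prefix if prefix is None else prefix
    return [(chosen or "") + text for text in texts]


docs = ["The cat sat on the mat."]
# Config value of " " is applied when nothing is passed explicitly.
print(with_prefix(docs, config_prefix=" "))
# Passing prefix="" drops the leading space for this call.
print(with_prefix(docs, prefix="", config_prefix=" "))
```

Under that assumption, passing an empty prefix is enough to evaluate facebook/bart-large-xsum without the leading space until the config itself is fixed.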