Reproducing Pegasus results on the CNN/DailyMail dataset

I am trying to fine-tune the Pegasus implementation in transformers on the CNN/DailyMail dataset, starting from the 'google/pegasus-large' checkpoint. However, I have not been able to reach the reported numbers (see "Pegasus: replication and distillation results", Issue #6844, huggingface/transformers on GitHub). My results are ROUGE-1: 43.7 and ROUGE-L: 40.6. I suspect that some of my hyperparameters need to change, but I am not sure which ones.

I would be grateful for any comments or advice.

P.S. These are the hyperparameters I have tried (values in parentheses are alternatives I also ran):

max_input_length=1024,
max_output_length=128,
freeze_encoder=False,
freeze_embeds=True,
learning_rate=1e-4 (1e-3),
weight_decay=0.0,
adam_epsilon=1e-8,
warmup_steps=10000,
gradient_accumulation_steps=8,
fp_16=False,
opt_level='O1',
max_grad_norm=1.0,
num_train_epochs=20 (10),
train_batch_size=4,
eval_batch_size=16
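
To make the list above easier to compare with other setups, here is a rough sketch of the effective batch size and warmup share these settings imply. It assumes a single GPU and a training set of 287,113 examples (the commonly cited CNN/DailyMail train split size, not a number taken from my run); the helper names are hypothetical and not part of transformers.

```python
# Sketch only: hypothetical helpers, not part of the transformers library.

def effective_batch_size(per_device_batch, accumulation_steps, num_devices=1):
    """Number of examples contributing to one optimizer update."""
    return per_device_batch * accumulation_steps * num_devices


def optimizer_steps(num_examples, batch_size, epochs):
    """Approximate optimizer updates over training (drops the partial last batch)."""
    return (num_examples // batch_size) * epochs


# Values from the hyperparameter list; a single device is an assumption.
batch = effective_batch_size(per_device_batch=4, accumulation_steps=8)

# Assumed size of the CNN/DailyMail training split.
NUM_TRAIN_EXAMPLES = 287_113

total = optimizer_steps(NUM_TRAIN_EXAMPLES, batch, epochs=20)

# Share of all optimizer updates spent in warmup (warmup_steps=10000).
warmup_fraction = 10_000 / total

print(f"effective batch size: {batch}")
print(f"total optimizer steps: {total}")
print(f"warmup fraction: {warmup_fraction:.3f}")
```

If the reference run used a different effective batch size or number of updates, that could be one place where my setup differs.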