Is decoder-only fine-tuning enough for UMT5?

I'm looking into fine-tuning the UMT5 (encoder-decoder) model for a translation task.

Has anyone explored the difference in end-task performance when fine-tuning such an (unsupervised pre-trained) encoder-decoder model in either of these settings?

  • whole-model fine-tuning
  • decoder-only fine-tuning (encoder frozen; see the sketch below)
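For context, here is a minimal sketch of what I mean by the second setting, assuming the Hugging Face transformers UMT5 implementation (the google/umt5-small checkpoint is just an example size):

```python
from transformers import AutoTokenizer, UMT5ForConditionalGeneration

checkpoint = "google/umt5-small"  # example; any UMT5 size should work the same way
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = UMT5ForConditionalGeneration.from_pretrained(checkpoint)

# Freeze every encoder parameter so only the decoder (and the LM head)
# receive gradients during fine-tuning.
for param in model.get_encoder().parameters():
    param.requires_grad = False

# Caveat: UMT5 shares the input embedding table between encoder and decoder,
# so freezing all encoder parameters also freezes those shared embeddings.

# Sanity check: how much of the model is still trainable?
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable params: {trainable:,} / {total:,}")
```

The frozen model can then be passed to a standard training loop or Trainer; the optimizer will simply skip the frozen encoder weights.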