How to Fine-tune the Rostlab/prot_t5_xl_uniref50 Model for Sequence Generation

Dear Sir,

I’d like to fine-tune the pre-trained model Rostlab/prot_t5_xl_uniref50.

The dataset I plan to use for fine-tuning looks like this (colon-separated):

P R T <extra_id_0> I N S E Q W <extra_id_1> E N C E :  P R T K I N S E Q W H E N C E 
G M M  <extra_id_0>  <extra_id_1> K P H G : G M M V E K P H G
R H G L  <extra_id_0>  <extra_id_1> : R H G L Q F

...etc...

The left column (before the colon) is the input supplied by a user, and the right column (after the colon) is the expected output.
The output is simply the ‘active’ protein sequence inferred from experiments.
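
To make the format concrete, here is a minimal sketch of how such a file could be split into input/target pairs before tokenization. The function name `parse_pairs` and the whitespace normalization are my own assumptions for illustration; nothing here comes from the ProtTrans repository.

```python
from typing import List, Tuple


def parse_pairs(lines: List[str]) -> List[Tuple[str, str]]:
    """Split colon-separated lines into (input, target) pairs.

    Hypothetical helper: assumes exactly one colon separates the
    masked input from the full target sequence on each line.
    """
    pairs = []
    for line in lines:
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank or malformed lines
        source, target = line.split(":", 1)
        # Collapse repeated spaces so tokens are separated by a single space.
        pairs.append((" ".join(source.split()), " ".join(target.split())))
    return pairs


# The three sample rows from the dataset above, copied verbatim.
examples = [
    "P R T <extra_id_0> I N S E Q W <extra_id_1> E N C E :  P R T K I N S E Q W H E N C E ",
    "G M M  <extra_id_0>  <extra_id_1> K P H G : G M M V E K P H G",
    "R H G L  <extra_id_0>  <extra_id_1> : R H G L Q F",
]

pairs = parse_pairs(examples)
for source, target in pairs:
    print(f"input={source!r} target={target!r}")
```

Each pair would then be the source text and the label text for a sequence-to-sequence training example.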

Unfortunately, I have been unable to find a suitable example in the ProtTrans repository.

I would also like to know whether the Hugging Face script run_mlm.py can be used with this pre-trained model.

I would truly appreciate your insights and recommendations on how to proceed with this task. Thank you in advance for your time and consideration.

Regards,
littleworth