Dear Sir,
I’d like to fine-tune the pre-trained model Rostlab/prot_t5_xl_uniref50.
The dataset I plan to use for fine-tuning looks like this (colon-separated):
P R T <extra_id_0> I N S E Q W <extra_id_1> E N C E : P R T K I N S E Q W H E N C E
G M M <extra_id_0> <extra_id_1> K P H G : G M M V E K P H G
R H G L <extra_id_0> <extra_id_1> : R H G L Q F
...etc...
The left column is the input from a user, with unknown residues masked by sentinel tokens, and the right column is the expected output.
The output is simply the ‘active’ protein inferred from experiments.
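For concreteness, here is roughly how I would read such a file into (input, target) pairs before tokenization. This is only a minimal sketch: the function name `parse_pairs` is my own invention, and I am assuming each record sits on one line with exactly one separating colon.

```python
from typing import List, Tuple


def parse_pairs(lines: List[str]) -> List[Tuple[str, str]]:
    """Split colon-separated lines into (input, target) string pairs.

    Hypothetical helper: assumes one record per line, with the masked
    input on the left of the first colon and the full sequence on the right.
    """
    pairs = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and anything without a separator.
        if not line or ":" not in line:
            continue
        source, target = line.split(":", 1)
        pairs.append((source.strip(), target.strip()))
    return pairs


examples = [
    "P R T <extra_id_0> I N S E Q W <extra_id_1> E N C E : P R T K I N S E Q W H E N C E",
    "G M M <extra_id_0> <extra_id_1> K P H G : G M M V E K P H G",
    "R H G L <extra_id_0> <extra_id_1> : R H G L Q F",
]
pairs = parse_pairs(examples)
for source, target in pairs:
    print(source, "->", target)
```

Each pair would then be tokenized, with the left string as the model input and the right string as the label, if that is the correct way to set this up.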
Unfortunately, I have been unable to find a suitable example in the ProtTrans repository.
Additionally, I would like to know whether the Hugging Face script run_mlm.py can be used with this pre-trained model.
I would truly appreciate your insights and recommendations on how to proceed with this task. Thank you in advance for your time and consideration.
Rgds,
littleworth