Hi!
The Google Research team recently released new T5 checkpoints (fine-tuned specifically for Natural Questions) in TF checkpoint format (not h5).
Is there a convenient way to convert these checkpoints so they can be loaded with the HF T5 model?
Yes, great idea!
You could try downloading the Google Cloud ckpt files, then run:
python src/transformers/convert_t5_original_tf_checkpoint_to_pytorch.py --tf_checkpoint_path FIXME/model.ckpt-1014600 --pytorch_dump_path FIXME --config_file t5-base-config.json
transformers-cli upload FIXME
where FIXME is the new model name.
Post a github issue if that breaks!
Thanks, Sam, for the suggestion and encouragement!
I had very little time yesterday, so I only made minor changes to the command line you suggested, to make sure it worked for loading the weights into the model (I tested on Colab):
!gsutil -m cp -r gs://t5-data/pretrained_models/cbqa/t5.1.1.small_ssm_nq .
!python transformers/src/transformers/convert_t5_original_tf_checkpoint_to_pytorch.py --tf_checkpoint_path t5.1.1.small_ssm_nq/model.ckpt-1110000 --pytorch_dump_path t5.1.1.small_ssm_nq --config_file t5-small-config.json
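For anyone who wants to repeat this for other checkpoints in the same bucket, here is a minimal sketch that only builds the two command lines above from a bucket URI and a checkpoint step. The helper name `build_conversion_commands` and its parameters are hypothetical (not part of transformers); the flags are the ones used in the commands above, and nothing is executed unless you pass the lists to `subprocess.run` yourself.

```python
import shlex

# Path to the conversion script as used in the Colab command above.
DEFAULT_SCRIPT = (
    "transformers/src/transformers/"
    "convert_t5_original_tf_checkpoint_to_pytorch.py"
)


def build_conversion_commands(bucket_uri, step, config_file, script=DEFAULT_SCRIPT):
    """Return (download_cmd, convert_cmd) as argument lists.

    Hypothetical helper: it assumes the checkpoint prefix is
    `<model_dir>/model.ckpt-<step>`, as in the commands in this thread.
    """
    # The local directory name is the last component of the bucket URI.
    model_dir = bucket_uri.rstrip("/").rsplit("/", 1)[-1]
    download_cmd = ["gsutil", "-m", "cp", "-r", bucket_uri, "."]
    convert_cmd = [
        "python", script,
        "--tf_checkpoint_path", f"{model_dir}/model.ckpt-{step}",
        "--pytorch_dump_path", model_dir,
        "--config_file", config_file,
    ]
    return download_cmd, convert_cmd


download_cmd, convert_cmd = build_conversion_commands(
    "gs://t5-data/pretrained_models/cbqa/t5.1.1.small_ssm_nq",
    1110000,
    "t5-small-config.json",
)
# Print the commands for inspection instead of running them.
print(shlex.join(download_cmd))
print(shlex.join(convert_cmd))
```

To actually run them, something like `subprocess.run(download_cmd, check=True)` followed by `subprocess.run(convert_cmd, check=True)` should work, assuming `gsutil` and the transformers repo are available in the environment.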
UPDATED:
I just realized that the newly shared T5 weights use a slightly different config from the original, because they are based on T5.1.1 (see here).
Therefore, the weights cannot be converted directly, because the architecture differs between the original T5 and T5.1.1.
At the moment, I have not been able to adapt the T5 config file to T5.1.1.
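To see where two configs disagree before attempting a conversion, a small diff helper can be useful. This is only a sketch: `diff_configs` is a hypothetical helper, and the example values below are illustrative assumptions about how an original T5-small config and a T5.1.1-small config might differ. They are not taken from this thread and should be checked against the actual config files.

```python
def diff_configs(old, new):
    """Return {key: (old_value, new_value)} for every key whose value differs.

    Keys missing on one side are reported with None on that side.
    """
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


# Illustrative values only (assumptions, verify against the real configs).
original_small = {
    "d_model": 512,
    "d_ff": 2048,
    "num_heads": 8,
    "num_layers": 6,
    "feed_forward_proj": "relu",
}
t5_1_1_small = {
    "d_model": 512,
    "d_ff": 1024,
    "num_heads": 6,
    "num_layers": 8,
    "feed_forward_proj": "gated-gelu",
}

for key, (old_value, new_value) in diff_configs(original_small, t5_1_1_small).items():
    print(f"{key}: {old_value} -> {new_value}")
```

In practice you would load both dictionaries with `json.load` from the two config files instead of typing them in. Note that a differing activation (for example a gated feed-forward layer) changes the set of weight tensors, so editing the config numbers alone would not be enough for the conversion script to succeed.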
UPDATED 2: Okay, I found ongoing work by Patrick on T5.1.1 in our own HF GitHub: