How to load mT5 checkpoint (.ckpt)?


I loaded mT5 with
model = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")
tokenizer = MT5Tokenizer.from_pretrained("google/mt5-small")

Then, I fine-tuned it with PyTorch Lightning and now have a checkpoint in the .ckpt format. However, when I try to load it again with

model = AutoModelWithLMHead.from_pretrained("path/to/checkpoint.ckpt")
tokenizer = MT5Tokenizer.from_pretrained("google/mt5-small")

it fails with this error:

Traceback (most recent call last):
   line 8, in <module>
    model = AutoModelWithLMHead.from_pretrained("path/to/checkpoint.ckpt")
 "/transformers/models/auto/", line 930, in from_pretrained
    pretrained_model_name_or_path, return_unused_kwargs=True, **kwargs
  "/transformers/models/auto/", line 360, in from_pretrained
    config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  "/transformers/", line 423, in get_config_dict
    config_dict = cls._dict_from_json_file(resolved_config_file)
  "/transformers/", line 506, in _dict_from_json_file
    text =
  File "/usr/lib/python3.6/", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 64: invalid start byte

I have seen that it may be necessary to have a config.json, vocab.json, and merges.txt alongside the checkpoint. The config.json is available on the google/mt5-small Hugging Face page, so I downloaded it and added it to the folder with the checkpoint. However, the other files are not there. What are the steps to load the .ckpt of mT5?
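For what it's worth, `from_pretrained` expects a directory containing a config.json, so pointing it at the binary .ckpt makes the library try to parse the pickle as JSON, which is where the UnicodeDecodeError comes from. A Lightning .ckpt keeps the weights under a "state_dict" key, with each key prefixed by the attribute name the LightningModule used for the Hugging Face model. A minimal sketch, assuming that attribute was called `model` (so keys look like "model.shared.weight"):

```python
def strip_prefix(state_dict, prefix="model."):
    # Keep only the Hugging Face model weights and drop the Lightning
    # attribute prefix, e.g. "model.shared.weight" -> "shared.weight".
    return {k[len(prefix):]: v for k, v in state_dict.items()
            if k.startswith(prefix)}

def load_finetuned_mt5(ckpt_path, base="google/mt5-small"):
    # Heavy imports kept inside the function so strip_prefix stands alone.
    import torch
    from transformers import MT5ForConditionalGeneration

    ckpt = torch.load(ckpt_path, map_location="cpu")
    model = MT5ForConditionalGeneration.from_pretrained(base)
    model.load_state_dict(strip_prefix(ckpt["state_dict"]))
    return model
```

If your LightningModule used a different attribute name, pass that name (plus a dot) as `prefix`.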


Hi @Skylixia,

Did you manage to solve the problem?

Any updates? Maybe try converting the .ckpt to pytorch_model.bin? Does anybody know how to do this?
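One way to do that conversion: rebuild the Hugging Face model from the Lightning state_dict and call `save_pretrained`, which writes pytorch_model.bin plus config.json into a folder that `from_pretrained` can load directly. A hedged sketch, assuming the LightningModule stored the HF model under an attribute named `model` (so weight keys carry a "model." prefix):

```python
def strip_lightning_prefix(state_dict, prefix="model."):
    # Drop the attribute-name prefix Lightning adds to every weight key.
    return {k[len(prefix):] if k.startswith(prefix) else k: v
            for k, v in state_dict.items()}

def ckpt_to_hf_folder(ckpt_path, out_dir, base="google/mt5-small"):
    # Heavy imports kept inside the function so the helper above stands alone.
    import torch
    from transformers import MT5ForConditionalGeneration, MT5Tokenizer

    ckpt = torch.load(ckpt_path, map_location="cpu")
    model = MT5ForConditionalGeneration.from_pretrained(base)
    model.load_state_dict(strip_lightning_prefix(ckpt["state_dict"]))
    model.save_pretrained(out_dir)  # writes pytorch_model.bin + config.json
    MT5Tokenizer.from_pretrained(base).save_pretrained(out_dir)  # tokenizer files
```

After that, `MT5ForConditionalGeneration.from_pretrained(out_dir)` should work. Note that mT5 uses a SentencePiece tokenizer, so you get a spiece.model rather than vocab.json/merges.txt, which is why those files are not on the model page.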