Hello! I hope you're having a great day!
I tried to use transformers-cli to convert a TF checkpoint into a PyTorch ALBERT model, but I'm running into unexpected errors.
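For context, the command I ran looked roughly like this (the paths are placeholders for my actual files):

transformers-cli convert --model_type albert \
    --tf_checkpoint /path/to/albert/model.ckpt \
    --config /path/to/albert_config.json \
    --pytorch_dump_output /path/to/pytorch_model.bin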
Here's the error I get:
Skipping albert/embeddings/layer_normalization/beta
Traceback (most recent call last):
  File "/path/transformers-cli", line 11, in <module>
    sys.exit(main())
  File "/path/transformers/commands/transformers_cli.py", line 55, in main
    service.run()
  File "/path/transformers/commands/convert.py", line 94, in run
    convert_tf_checkpoint_to_pytorch(self._tf_checkpoint, self._config, self._pytorch_dump_output)
  File "/path/transformers/models/albert/convert_albert_original_tf_checkpoint_to_pytorch.py", line 36, in convert_tf_checkpoint_to_pytorch
    load_tf_weights_in_albert(model, config, tf_checkpoint_path)
  File "/path/transformers/models/albert/modeling_albert.py", line 164, in load_tf_weights_in_albert
    pointer = getattr(pointer, "bias")
  File "/path/torch/nn/modules/module.py", line 1207, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'AlbertEmbeddings' object has no attribute 'bias'
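In case it helps, the variable names stored in the checkpoint can be listed with something like this (the path is a placeholder for my actual checkpoint prefix); this is where names such as albert/embeddings/layer_normalization/beta come from:

import tensorflow as tf

# Print every variable name (and its shape) saved in the TF checkpoint.
# "/path/to/albert/model.ckpt" stands in for the real checkpoint prefix.
for name, shape in tf.train.list_variables("/path/to/albert/model.ckpt"):
    print(name, shape)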
From my understanding, the error comes from a part of load_tf_weights_in_albert (transformers/models/albert/modeling_albert.py on the main branch of huggingface/transformers on GitHub), more precisely from these lines:
if scope_names[0] == "kernel" or scope_names[0] == "gamma":
    pointer = getattr(pointer, "weight")
elif scope_names[0] == "output_bias" or scope_names[0] == "beta":
    pointer = getattr(pointer, "bias")
What I guess is happening:
The loader looks for a weight or bias attribute whenever the TF variable name contains gamma or beta, but those attributes can't be found on the module it is pointing at. For example, the first object on which a bias attribute is looked up is AlbertEmbeddings, because the variable name is albert/embeddings/layer_normalization/beta, and AlbertEmbeddings has no attribute with that name.
Instead, AlbertEmbeddings has a LayerNorm submodule, and that is what actually holds the bias (beta for layer norm) and weight (gamma for layer norm). So I guess what should be done is to retrieve the bias and weight of the corresponding LayerNorm? A small sketch of what I mean is below.
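Here is a minimal illustration (EmbeddingsLike is a made-up module, not the real AlbertEmbeddings) of where the parameters actually live:

import torch.nn as nn

# Made-up module that, like AlbertEmbeddings, only exposes its layer norm
# parameters through a LayerNorm submodule, not directly on itself.
class EmbeddingsLike(nn.Module):
    def __init__(self, hidden_size=16):
        super().__init__()
        self.LayerNorm = nn.LayerNorm(hidden_size)

m = EmbeddingsLike()
print(hasattr(m, "bias"))        # False, so getattr(m, "bias") raises AttributeError
print(m.LayerNorm.bias.shape)    # torch.Size([16]), this is beta
print(m.LayerNorm.weight.shape)  # torch.Size([16]), this is gamma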
Though I only get an error for AlbertEmbeddings (because the code stops running there), other objects may be subject to the same problem. AlbertModel contains other objects that have LayerNorm attributes (AlbertLayer, AlbertAttention, …), those LayerNorm modules have different attribute names in each of them, and sometimes they are nested (AlbertLayerGroup holds multiple AlbertLayer objects, each with its own LayerNorm); a sketch of how they could be listed is below.
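For instance, something like this should print every layer norm in the model and where it sits (using a default AlbertConfig just to inspect the structure):

import torch.nn as nn
from transformers import AlbertConfig, AlbertModel

# Instantiate an untrained ALBERT model only to look at its module tree.
model = AlbertModel(AlbertConfig())

# Print the qualified name of every LayerNorm module, i.e. the one inside
# the embeddings and the ones nested inside the layer groups.
for name, module in model.named_modules():
    if isinstance(module, nn.LayerNorm):
        print(name)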
I'm not sure my interpretation is correct…
Is there any way to fix this error? Or another way to obtain a correct PyTorch model from a TF checkpoint for an AlbertModel?
Thanks for your attention and for any help you can provide!