Is there a way to convert the checkpoint of a BERT model that was fine-tuned for a token classification task (NER) using the original TensorFlow BERT script?
I am trying to use "transformers-cli convert", but it doesn't work except on pretrained models.
I want to share my fine-tuned models on the Model Hub, but they can't be converted.
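For reference, the command I'm running is roughly the following (the checkpoint prefix and paths below are from my run directory, shown later in this thread):

```shell
# Attempted conversion via the transformers CLI; --tf_checkpoint takes the
# checkpoint prefix (without the .index/.meta/.data-* suffix).
transformers-cli convert --model_type bert \
  --tf_checkpoint hub_models/ena_all_9010/anatomy/4e-05-90/model.ckpt-6316 \
  --config hub_models/ena_all_9010/anatomy/4e-05-90/config.json \
  --pytorch_dump_output hub_models/ena_all_9010/anatomy/4e-05-90/pytorch_model.bin
```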
Thank you for your reply. I tried this approach and got this error:
OSError: Error no file named ['pytorch_model.bin', 'tf_model.h5', 'model.ckpt.index', 'flax_model.msgpack'] found in directory hub_models/ena_all_9010/anatomy/4e-05-90 or from_tf and from_flax set to False.
I was looking for a way to convert the checkpoints of a TensorFlow fine-tuned model to PyTorch, as I did with the checkpoints of pretrained BERT models. Here are the files of one fine-tuned BERT model:
checkpoint
config.json
eval
eval_results.txt
eval.tf_record
graph.pbtxt
label2id.pkl
label_test.txt
model.ckpt-3000.data-00000-of-00001
model.ckpt-3000.index
model.ckpt-3000.meta
model.ckpt-4000.data-00000-of-00001
model.ckpt-4000.index
model.ckpt-4000.meta
model.ckpt-5000.data-00000-of-00001
model.ckpt-5000.index
model.ckpt-5000.meta
model.ckpt-6000.data-00000-of-00001
model.ckpt-6000.index
model.ckpt-6000.meta
model.ckpt-6316.data-00000-of-00001
model.ckpt-6316.index
model.ckpt-6316.meta
NER_result_conll.txt
predict.tf_record
token_test.txt
train.tf_record
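The conversion command I used on these files was along these lines, picking the latest checkpoint as an example:

```shell
# Run the transformers conversion script on the newest checkpoint;
# --tf_checkpoint_path takes the checkpoint prefix, not an individual file.
python convert_bert_original_tf_checkpoint_to_pytorch.py \
  --tf_checkpoint_path model.ckpt-6316 \
  --bert_config_file config.json \
  --pytorch_dump_path pytorch_model.bin
```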
I tried to convert the checkpoints using convert_bert_original_tf_checkpoint_to_pytorch.py, but I got the following error:
raise ModuleAttributeError("'{}' object has no attribute '{}'".format(
torch.nn.modules.module.ModuleAttributeError: 'BertForPreTraining' object has no attribute 'bias'
I tried to fix it by replacing line 33 of the script with model = BertForTokenClassification(config), but then I got the following error:
transformers/src/transformers/models/bert/modeling_bert.py", line 156, in load_tf_weights_in_bert
if pointer.shape != array.shape:
AttributeError: 'NoneType' object has no attribute 'shape'
Investigating this further showed that the comparison pointer.shape != array.shape fails because pointer = None for the following variables:
['bert', 'pooler', 'dense', 'bias'], where array.shape = (768,)
['bert', 'pooler', 'dense', 'kernel'], where array.shape = (768, 768)
['output_bias'], where array.shape = (7,)
['output_weights'], where array.shape = (7, 768)
I'd appreciate it if someone could help me figure out how to fix the pointer for those variables.
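In case it helps frame the question, here is a minimal sketch of the workaround I'm considering. It assumes the pooler weights can simply be skipped (BertForTokenClassification builds BertModel without a pooler, which would explain pointer = None there), and that output_weights/output_bias are the NER head saved by the original script, mapping to classifier.weight/classifier.bias. The arrays below are zero-filled stand-ins, not real checkpoint values:

```python
import numpy as np

# Hypothetical renaming of the fine-tuning script's head variables to the
# attribute names BertForTokenClassification expects.
NAME_MAP = {
    "output_weights": "classifier.weight",  # (7, 768) matches nn.Linear (out, in)
    "output_bias": "classifier.bias",       # (7,)
}

def remap(tf_vars):
    """Drop pooler variables (no pooler in token classification) and
    rename the classification head variables."""
    remapped = {}
    for name, array in tf_vars.items():
        if "pooler" in name:
            continue  # BertForTokenClassification has no pooler to load into
        remapped[NAME_MAP.get(name, name)] = array
    return remapped

# Stand-ins with the shapes reported above.
tf_vars = {
    "bert/pooler/dense/kernel": np.zeros((768, 768)),
    "bert/pooler/dense/bias": np.zeros(768),
    "output_weights": np.zeros((7, 768)),
    "output_bias": np.zeros(7),
}
print(sorted(remap(tf_vars)))  # ['classifier.bias', 'classifier.weight']
```

If this mapping is right, patching load_tf_weights_in_bert to skip the pooler and apply the rename should let the conversion finish, but I'd like confirmation before relying on it.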