Hello,
I’m trying to use one of the TinyBERT models produced by HUAWEI (link), and it seems there is a field missing in its config.json file:
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("huawei-noah/TinyBERT_General_4L_312D")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/lewtun/git/transformers/src/transformers/models/auto/tokenization_auto.py", line 345, in from_pretrained
    config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
  File "/Users/lewtun/git/transformers/src/transformers/models/auto/configuration_auto.py", line 360, in from_pretrained
    raise ValueError(
ValueError: Unrecognized model in huawei-noah/TinyBERT_General_4L_312D. Should have a `model_type` key in its config.json, or contain one of the following strings in its name: retribert, mt5, t5, mobilebert, distilbert, albert, bert-generation, camembert, xlm-roberta, pegasus, marian, mbart, mpnet, bart, blenderbot, reformer, longformer, roberta, deberta, flaubert, fsmt, squeezebert, bert, openai-gpt, gpt2, transfo-xl, xlnet, xlm-prophetnet, prophetnet, xlm, ctrl, electra, encoder-decoder, funnel, lxmert, dpr, layoutlm, rag, tapas
Looking at the config.json file (link), it seems like an easy enough fix to add something like "model_type": "tinybert". So my question is: how does one go about patching a fix in a community-provided model? Do I raise an issue on the Transformers repo, or somewhere else?
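In the meantime, here is a minimal sketch of the local workaround I have in mind: download the checkpoint, then patch the missing key into config.json by hand. One caveat: "tinybert" is not among the model types listed in the error message above, so my assumption is that the value should actually be "bert", since TinyBERT checkpoints follow the BERT architecture. The config path and field values below are illustrative, not the real checkpoint contents:

```python
import json
import os
import tempfile

# Stand-in for a locally downloaded checkpoint directory (assumption: in
# practice this would be the folder you cloned/downloaded the model into).
config_dir = tempfile.mkdtemp()
config_path = os.path.join(config_dir, "config.json")

# Illustrative config resembling TinyBERT_General_4L_312D, minus "model_type".
with open(config_path, "w") as f:
    json.dump({"hidden_size": 312, "num_hidden_layers": 4}, f)

# The patch: add "model_type" so AutoConfig can resolve the architecture.
# "bert" is an assumption based on TinyBERT being a distilled BERT.
with open(config_path) as f:
    config = json.load(f)
config.setdefault("model_type", "bert")
with open(config_path, "w") as f:
    json.dump(config, f, indent=2)
```

After patching, passing the local directory to `AutoTokenizer.from_pretrained` should resolve the model type, but that obviously doesn't fix the copy on the Hub for everyone else, hence the question.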