Problems using AutoModel.from_pretrained on custom model

Hi,

I’m working on a custom causal language model that doesn’t extend any existing Hugging Face models.

class BigBrainConfig(PretrainedConfig):
    model_type = 'big-brain-lm'
...
class BigBrainLanguageModel(PreTrainedModel):
    config_class = BigBrainConfig
    base_model_prefix = 'big-brain-lm'
...

I used the following code to upload the model and register it for AutoModel.

login(token=args.token)

BigBrainConfig.register_for_auto_class()
BigBrainLanguageModel.register_for_auto_class('AutoModel')

model = BigBrainLanguageModel(BigBrainConfig())
model.push_to_hub('big-brain-lm')

logout()

When I run the following code in my training script, I get this error:

model = AutoModel.from_pretrained('Fal7acy/big-brain-lm', trust_remote_code=True)
Traceback (most recent call last):
  File "C:\Users\parof\PycharmProjects\BigBrainAI\src\model\training.py", line 88, in <module>
    model = AutoModel.from_pretrained('Fal7acy/big-brain-lm', trust_remote_code=True)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\parof\PycharmProjects\Environments\BigBrainAI\Lib\site-packages\transformers\models\auto\auto_factory.py", line 487, in from_pretrained
    cls.register(config.__class__, model_class, exist_ok=True)
  File "C:\Users\parof\PycharmProjects\Environments\BigBrainAI\Lib\site-packages\transformers\models\auto\auto_factory.py", line 513, in register
    raise ValueError(
ValueError: The model class you are passing has a `config_class` attribute that is not consistent with the config class you passed (model has <class 'config.BigBrainConfig'> and you passed <class 'transformers_modules.Fal7acy.big-brain-lm.fc99f449c2d4cdb74e902662267cec031eeeea5f.config.BigBrainConfig'>. Fix one of those so they match!

I am probably doing something wrong, but I feel I have followed the “Sharing custom models” guide.

Thank you in advance!

I have not used this myself yet - but does the very last section of that tutorial apply in your case?

That is, registering the model and config with the auto classes directly, without the register_for_auto_class() methods.

That is for when the code lives in a different repo; my code is in the same repo the model is saved in. The config file for my model has the right section in it:

"auto_map": {
    "AutoConfig": "config.BigBrainConfig",
    "AutoModel": "language.BigBrainLanguageModel"
  }

I think the problem might be that the config class gets saved somewhere on the backend, so the class path is no longer the same when loading it:

(model has <class 'config.BigBrainConfig'> and you passed <class 'transformers_modules.Fal7acy.big-brain-lm.fc99f449c2d4cdb74e902662267cec031eeeea5f.config.BigBrainConfig'>

Perhaps it’s a bug?

To keep going, try overriding the config class by passing it explicitly to the from_pretrained call; that should force the config to be the one you pass in instead of the one read from the model repo.

config = BigBrainConfig()
model = AutoModel.from_pretrained('Fal7acy/big-brain-lm', trust_remote_code=True, config=config)

AutoConfig does read the auto_map setting in the config file, but based on the code, the model class itself is loaded from the downloaded remote code, so its config_class ends up pointing at the cached transformers_modules copy rather than your local class.
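One way to sidestep the mismatch entirely is to load through the concrete class instead of AutoModel, since then config_class and the loaded config come from the same local definition. A minimal offline sketch with a hypothetical toy model, using a local save/load round trip in place of the Hub repo:

```python
import tempfile

import torch
from transformers import PretrainedConfig, PreTrainedModel

# Hypothetical toy versions of the classes in the original post.
class BigBrainConfig(PretrainedConfig):
    model_type = 'big-brain-lm'

    def __init__(self, hidden_size=8, **kwargs):
        self.hidden_size = hidden_size
        super().__init__(**kwargs)

class BigBrainLanguageModel(PreTrainedModel):
    config_class = BigBrainConfig
    base_model_prefix = 'big-brain-lm'

    def __init__(self, config):
        super().__init__(config)
        self.linear = torch.nn.Linear(config.hidden_size, config.hidden_size)

    def forward(self, x):
        return self.linear(x)

with tempfile.TemporaryDirectory() as tmp:
    BigBrainLanguageModel(BigBrainConfig()).save_pretrained(tmp)
    # Loading through the concrete class: no auto_map lookup, no
    # transformers_modules copy, so config_class always matches.
    model = BigBrainLanguageModel.from_pretrained(tmp)
```

The same pattern should apply to the Hub repo (BigBrainLanguageModel.from_pretrained('Fal7acy/big-brain-lm')) as long as the local modules are importable.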