How to create a config.json after saving a model

Hi, I am trying to convert my model to ONNX format with the help of this notebook, but I got an error because config.json does not exist. My model is a custom model with extra layers, similar to this one.

Now how can I create a config.json file for this?

Normally, if you save your model using the .save_pretrained() method, it will save both the model weights and a config.json file in the specified directory.
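
For example (a minimal sketch; the checkpoint name is just an example):

from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")  # any standard checkpoint
model.save_pretrained("./my_model")                     # writes the weights plus config.json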

Yes, but this is a custom model that I have saved in plain PyTorch style, since it consists of additional layers. Is there any way to generate the config.json file?

You need to subclass PreTrainedModel to have the save_pretrained method available. So instead of

class Mean_Pooling_Model(nn.Module):

use

from transformers.modeling_utils import PreTrainedModel
class Mean_Pooling_Model(PreTrainedModel):

It will add extra functionality on top of nn.Module.
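
Roughly, a sketch of what that could look like; the config fields, class layout, and pooling head below are illustrative, not taken from the original notebook:

import torch.nn as nn
from transformers import AutoModel, PretrainedConfig, PreTrainedModel

class MeanPoolingConfig(PretrainedConfig):
    model_type = "mean_pooling"

    def __init__(self, base_model_name="bert-base-uncased", projection_dim=256, **kwargs):
        super().__init__(**kwargs)
        self.base_model_name = base_model_name
        self.projection_dim = projection_dim

class Mean_Pooling_Model(PreTrainedModel):
    config_class = MeanPoolingConfig

    def __init__(self, config):
        super().__init__(config)
        self.encoder = AutoModel.from_pretrained(config.base_model_name)                     # base transformer
        self.projection = nn.Linear(self.encoder.config.hidden_size, config.projection_dim)  # extra layer

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)  # mean pooling over tokens
        return self.projection(pooled)

config = MeanPoolingConfig()
model = Mean_Pooling_Model(config)
model.save_pretrained("./mean_pooling_model")   # now writes config.json alongside the weights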

Thank you, I will try this!

Is it possible to generate the configuration file for an already trained model, i.e. for weights stored in a normal PyTorch model.bin?

Use the model.config.to_json_file() method to generate config.json.

Did you end up finding a solution to getting a config.json from an already trained model? 🙂 I’m currently struggling with the same problem 🙁

Nope, I was not able to find a proper solution; I ended up writing the config.json manually.
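
Something along these lines, with illustrative values (here the BERT-base defaults); the fields have to match whatever architecture model_type declares:

import json

# Hand-written config; values below are the standard BERT-base ones, shown only as an example
config = {
    "model_type": "bert",
    "vocab_size": 30522,
    "hidden_size": 768,
    "num_hidden_layers": 12,
    "num_attention_heads": 12,
}

with open("config.json", "w") as f:
    json.dump(config, f, indent=2)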

You should be able to just call

model.config.to_json_file("config.json")

cc @seanbenhur

That only works for models that are Transformers-native and not plain nn.Module/PyTorch-native, sadly.

What is your use case where you are using Transformers but not Transformers models? If you want to use the HF Trainer with your own PyTorch model, I recommend subclassing the relevant classes, such as PreTrainedModel, and using your own PretrainedConfig alongside it.
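
A minimal sketch of such a config, with made-up class and field names:

from transformers import PretrainedConfig

class MyModelConfig(PretrainedConfig):
    model_type = "my_model"

    def __init__(self, hidden_size=256, num_layers=4, dropout=0.1, **kwargs):
        super().__init__(**kwargs)
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.dropout = dropout

config = MyModelConfig()
config.save_pretrained("./my_model")   # writes ./my_model/config.json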

I have a similar issue: I have my model’s weights (an nn.Module) and I want to convert it into a Hugging Face compatible model so that I can use Hugging Face methods (such as .generate). From the discussion I can see that I either have to retrain while switching from nn.Module to PreTrainedModel, or define my own config.json file. If I write my config.json file, what should I do next to load my torch model as a Hugging Face one?

I am not sure, from the discussion above, what the solution is. Can someone post a working example, please?

I am not sure whether this functionality exists at this moment.

Folks,
I am trying out the code at GitHub - aws-samples/aws-inferentia-huggingface-workshop: CMP314 Optimizing NLP models with Amazon EC2 Inf1 instances in Amazon Sagemaker.

On the Inferentia instance, I get a config.json not found error in the CloudWatch logs and inference fails, even though the config.json file is present in the traced model tar.gz for Inferentia.
Please help me resolve this.
Thanks,
Ajay

P.S. The log message: W-9002-model_1-stdout MODEL_LOG - OSError: file /home/model-server/tmp/models/cb9491669c1f44c1a0763e8a62d9368e/config.json not found

Was anyone able to resolve this issue, i.e., converting a custom nn.Module to a Hugging Face compatible version?

Did you find a solution/workaround for this issue?

@Hosna You can push the config via: customModel.pretrained_model.config.push_to_hub(repo_id)

So this worked for me. I imported

from transformers.modeling_utils import PreTrainedModel, PretrainedConfig

and then defined my class:

class TransformerLanguageModel(PreTrainedModel):
    def __init__(self, config):
        super(TransformerLanguageModel, self).__init__(config)
        self.token_embedding_table = nn.Embedding(config.vocab_size, config.hidden_size)
        self.position_embedding_table = nn.Embedding(config.block_size, config.hidden_size)
        self.transformer = nn.Transformer(
            d_model=config.hidden_size,
            nhead=config.num_attention_heads,
            num_encoder_layers=config.num_hidden_layers,
            num_decoder_layers=config.num_hidden_layers,
            dim_feedforward=4 * config.hidden_size,
            dropout=config.hidden_dropout_prob,
            activation='gelu'
        )
        self.ln1 = nn.LayerNorm(config.hidden_size)
        self.ln2 = nn.LayerNorm(config.hidden_size)
        self.lm_head = nn.Linear(config.hidden_size, config.vocab_size)

After that you have to create a configuration object:

config = PretrainedConfig(
    vocab_size=1000,  # Specify your vocabulary size
    hidden_size=n_embd,  # Use your embedding dimension
    num_attention_heads=n_head,
    num_hidden_layers=n_layer,
    hidden_dropout_prob=dropout,
    block_size=block_size
)

model = TransformerLanguageModel(config)
model.to(device)


Now you can save the model; save_pretrained will write config.json alongside the weights:

model.save_pretrained('./path_to_model/')
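
And, assuming the config and device objects from above, reloading should look roughly like this (the config is passed explicitly, since the class does not define a config_class):

# Hedged sketch: reload the saved weights using the PretrainedConfig created above
reloaded = TransformerLanguageModel.from_pretrained('./path_to_model/', config=config)
reloaded.to(device)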