Missing config.json file after AutoTraining

Detailed Problem Summary

Context:

  • Environment: Google Colab (Pro Version using a V100) for training.
  • Tool: Utilizing Hugging Face AutoTrain for fine-tuning a language model.

Sequence of Events:

  1. Initial Training:
  • Successfully trained a model using AutoTrain.
  • Process seemingly completed without errors, resulting in several output files.
  2. Missing config.json:
  • Despite successful training, noticed that the config.json file was not generated.
  • Without config.json, the trained model cannot be loaded for inference or further training.
  3. Manual Configuration:
  • Created a config.json manually based on the "base model" used for fine-tuning (NousResearch/Llama-2-7b-chat-hf), plus additional training and adapter parameters derived from the fine-tuned model's files that AutoTrain uploads to the HF repository.
  • Uploaded this config.json to the Hugging Face repository where the model resides.
  4. Upload to Repository:
  • Uploaded all relevant files, including pytorch_model.bin, adapter_config.json, adapter_model.bin, and others, to a Hugging Face repository named Kabatubare/meta_douglas_2.
  5. Model Loading Error:
  • Attempted to load the model and encountered the following error:

OSError: Kabatubare/meta_douglas_2 does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt, or flax_model.msgpack.
  6. File Size Anomaly:
  • Noticed that the size of the uploaded pytorch_model.bin is only 888 Bytes, which is far smaller than what is typical for such files.
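An 888-byte pytorch_model.bin is far too small to hold full 7B-parameter weights (a fp16 checkpoint is on the order of 13 GB); it is more likely a serialization stub, with the real trained parameters living in adapter_model.bin. A minimal sketch for auditing the local output directory, assuming typical AutoTrain file names (the size threshold is an illustrative assumption, not an official limit):

```python
from pathlib import Path

# Anything in the KB range clearly cannot be a full set of 7B model weights;
# 1 GB is a deliberately loose illustrative lower bound.
FULL_MODEL_MIN_BYTES = 1_000_000_000

def audit_output_dir(output_dir: str) -> dict:
    """Report file sizes and flag a pytorch_model.bin that looks like a stub."""
    report = {}
    for f in sorted(Path(output_dir).iterdir()):
        size = f.stat().st_size
        suspicious = f.name == "pytorch_model.bin" and size < FULL_MODEL_MIN_BYTES
        report[f.name] = {"bytes": size, "suspicious": suspicious}
    return report
```

Running this over the training output would immediately flag the 888-byte file while leaving the adapter files alone.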

Repository File Structure:

  • adapter_config.json
  • adapter_model.bin
  • added_tokens.json
  • config.json (manually added)
  • pytorch_model.bin (888 Bytes, suspected to be incorrect or incomplete)
  • Tokenizer files (tokenizer.json, tokenizer.model, etc.)
  • Training parameters (training_args.bin, training_params.json)

Specific Questions for the Hugging Face Community:

  1. Configuration File: Why is a config.json not generated by AutoTrain by default? Is there a specific setting or flag that needs to be enabled to output this file?
  2. File Size Issue:
  • What could cause pytorch_model.bin to be so small (888 Bytes)?
  • Could this be a symptom of an incomplete or failed save operation?
  3. Manual Configuration:
  • Are there standard procedures or checks to verify that a manually created config.json is accurate?
  • Are there tools to validate the config.json against the actual PyTorch model file?
  4. Error Resolution:
  • How can the OSError encountered while loading the model be resolved?
  • Are there specific requirements for the directory structure when loading models from a Hugging Face repository?
  5. Model Integrity:
  • Given the missing config.json and the small size of pytorch_model.bin, are there steps to verify the integrity of the trained model?
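On validating a hand-written config.json: one cheap offline check is to confirm that it parses as JSON and that its architecture fields are consistent with the base model recorded in the adapter_config.json that PEFT writes. A rough sketch, assuming the usual transformers/PEFT field names for a Llama-family model (adjust the expected values to your own base model):

```python
import json
from pathlib import Path

def check_config_consistency(model_dir: str) -> list:
    """Return a list of human-readable problems found in config.json."""
    problems = []
    d = Path(model_dir)
    try:
        cfg = json.loads((d / "config.json").read_text())
    except FileNotFoundError:
        return ["config.json is missing"]
    except json.JSONDecodeError as e:
        return [f"config.json is not valid JSON: {e}"]

    # A causal Llama checkpoint should declare its model_type and architecture.
    if cfg.get("model_type") != "llama":
        problems.append(f"unexpected model_type: {cfg.get('model_type')!r}")
    if "LlamaForCausalLM" not in cfg.get("architectures", []):
        problems.append("architectures does not list LlamaForCausalLM")

    # Cross-check against the adapter config written by PEFT, if present.
    adapter_path = d / "adapter_config.json"
    if adapter_path.exists():
        adapter = json.loads(adapter_path.read_text())
        base = adapter.get("base_model_name_or_path", "")
        if cfg.get("_name_or_path") and cfg["_name_or_path"] != base:
            problems.append(
                f"_name_or_path {cfg['_name_or_path']!r} differs from "
                f"adapter base model {base!r}"
            )
    return problems
```

This does not prove the config matches the weights, but it catches the most common hand-editing mistakes before a slow model load fails.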

Hi, were you ever able to solve your problem? I have the same issue after using AutoTrain and there is no config.json file


I think your adapter_model.bin is your pytorch_model.bin, just need to rename it maybe.


I have the exact same issue! Any update on this ?


For those still wondering, I found this answer helpful:

In other words, by training with PEFT you haven't saved the whole model, only the parameters that were updated by LoRA. This is called the "adapter model". In order to run inference on the whole fine-tuned model, you need to merge the adapter model with the original base model. Here is a guide to do it on your local machine.
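The merge described above can be sketched with the peft library's merge-and-unload workflow. This is a sketch, not the exact guide referenced; the repository name comes from the original post (substitute your own adapter repo), and merging a 7B model needs enough RAM/VRAM to hold the base weights:

```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Adapter repo from this thread; replace with your own AutoTrain output repo.
adapter_repo = "Kabatubare/meta_douglas_2"

# Loads the base model named in adapter_config.json, then applies the adapter.
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_repo,
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
)

# Fold the LoRA weights into the base weights and drop the adapter wrappers.
merged = model.merge_and_unload()

# Saving the merged model writes both the full weights and a config.json.
merged.save_pretrained("meta_douglas_2-merged")
AutoTokenizer.from_pretrained(adapter_repo).save_pretrained("meta_douglas_2-merged")
```

The resulting directory is a standalone checkpoint that loads with plain AutoModelForCausalLM, with no adapter or renamed files needed.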

However, if you want to run the inference API, you can use a library made for this purpose, called peft. In order to activate it, you need to specify it in the README of your uploaded model.
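For reference, the model card metadata that enables this looks roughly like the YAML front matter below, placed at the very top of the repo's README.md. The base_model value here is the one from the original post; substitute your own:

```yaml
---
library_name: peft
base_model: NousResearch/Llama-2-7b-chat-hf
---
```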

Currently there is a problem: if the base model requires an authentication token (e.g. Llama 2), it won't work. This is because the peft inference API struggles to use your token to fetch the base model. Maybe I am doing something wrong, but in the end I didn't manage to make it work. If someone finds a solution I would love to know.

Disclaimer: I am not an expert, I am a beginner. Just trying to save other people the hassle I went through to figure this out '-_-


Hi,

[ISSUE]: {model_name} does not appear to have a file named config.json

Any updates from the HuggingFace team? @sgugger

I'm having the same issue.
Step 1: (using HF Spaces AutoTrain GUI)
I fine-tuned mistralai/Mixtral-8x7B-Instruct-v0.1 on my data using SFT.

Step 2: (using HF Spaces AutoTrain GUI)
I'm trying to further fine-tune my SFT fine-tuned model on my data using DPO now.

Step3: (using HF Spaces AutoTrain GUI)
In the APP pane I upload my train.csv and click on Start Training

Step 4: I get back the following error
ā€œā€"
:x: ERROR | 2024-03-13 09:41:28 | autotrain.trainers.common:wrapper:92 - xxxxx/autotrain-xxxxx-xxxxx does not appear to have a file named config.json. Checkout ā€˜https://huggingface.co/xxxxx-xxxxx/autotrain-xxxxx-xxxxx/mainā€™ for available files.
ā€œā€"

Doesn't the config.json file get generated automatically when using AutoTrain? Should I create it myself? I'm not sure how. Is there a specific documentation page on this issue?

@sgugger could you please assist?

Many thanks!

I'm getting the same error. Did you fix that problem @nadav-sellence?

Thanks! That worked.