Missing config.json file after AutoTraining

Detailed Problem Summary

Context:

  • Environment: Google Colab (Pro Version using a V100) for training.
  • Tool: Utilizing Hugging Face AutoTrain for fine-tuning a language model.

Sequence of Events:

  1. Initial Training:
  • Successfully trained a model using AutoTrain.
  • The process appeared to complete without errors and produced several output files.
  2. Missing config.json:
  • Despite the successful training run, no config.json file was generated.
  • Without config.json, the trained model cannot be loaded for inference or further training.
  3. Manual Configuration:
  • Created a config.json manually, based on the base model used for fine-tuning (NousResearch/Llama-2-7b-chat-hf) plus additional training and adapter parameters taken from the fine-tuned model’s files that AutoTrain uploads to the HF repository.
  • Uploaded this config.json to the Hugging Face repository where the model resides.
  4. Upload to Repository:
  • Uploaded all relevant files, including pytorch_model.bin, adapter_config.json, adapter_model.bin, and others, to a Hugging Face repository named Kabatubare/meta_douglas_2.
  5. Model Loading Error:
  • Attempted to load the model and encountered the following error:

OSError: Kabatubare/meta_douglas_2 does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt, or flax_model.msgpack.
  6. File Size Anomaly:
  • Noticed that the uploaded pytorch_model.bin is only 888 Bytes, far too small to hold the weights of a 7B-parameter model.

Repository File Structure:

  • adapter_config.json
  • adapter_model.bin
  • added_tokens.json
  • config.json (manually added)
  • pytorch_model.bin (888 Bytes, suspected to be incorrect or incomplete)
  • Tokenizer files (tokenizer.json, tokenizer.model, etc.)
  • Training parameters (training_args.bin, training_params.json)
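
A file listing like the one above can be checked mechanically. The following is only a sketch: `diagnose_repo`, the file-name sets, and the 1 MB threshold are hypothetical choices made for illustration, and apart from the reported 888 Bytes the sizes in the example are made up.

```python
# Hypothetical helper: the names and the size threshold are illustrative assumptions.
ADAPTER_FILES = {"adapter_config.json", "adapter_model.bin", "adapter_model.safetensors"}
FULL_WEIGHT_FILES = {"pytorch_model.bin", "model.safetensors", "tf_model.h5", "flax_model.msgpack"}
MIN_PLAUSIBLE_WEIGHT_BYTES = 1_000_000  # assumption: real weight files are far larger than 1 MB


def diagnose_repo(file_sizes):
    """Return readable findings for a mapping of filename -> size in bytes."""
    findings = []
    names = set(file_sizes)
    if names & ADAPTER_FILES:
        findings.append("adapter files present: repository looks like a PEFT/LoRA adapter")
    for name in sorted(names & FULL_WEIGHT_FILES):
        if file_sizes[name] < MIN_PLAUSIBLE_WEIGHT_BYTES:
            findings.append(
                f"{name} is only {file_sizes[name]} bytes: too small to hold full model weights"
            )
    if "config.json" not in names:
        findings.append("config.json missing")
    return findings


# File names from the post; only the 888-byte size was reported, the rest are made up.
repo = {
    "adapter_config.json": 500,
    "adapter_model.bin": 160_000_000,
    "config.json": 700,
    "pytorch_model.bin": 888,
}
for finding in diagnose_repo(repo):
    print(finding)
```

If both findings appear, the repository most likely holds only an adapter and the small pytorch_model.bin is not a usable checkpoint.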

Specific Questions for the Hugging Face Community:

  1. Configuration File: Why is a config.json not generated by AutoTrain by default? Is there a specific setting or flag that needs to be enabled to output this file?
  2. File Size Issue:
  • What could cause pytorch_model.bin to be so small (888 Bytes)?
  • Could this be a symptom of an incomplete or failed save operation?
  3. Manual Configuration:
  • Are there standard procedures or checks to verify that a manually created config.json is accurate?
  • Are there tools to validate the config.json against the actual PyTorch model file?
  4. Error Resolution:
  • How can the OSError encountered while loading the model be resolved?
  • Are there specific requirements for the directory structure when loading models from a Hugging Face repository?
  5. Model Integrity:
  • Given the missing config.json and the small size of pytorch_model.bin, are there steps to verify the integrity of the trained model?
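
One rough way to sanity-check a hand-written config.json is to diff it against the base model's config.json. This is a sketch under assumptions, not an official validation tool: `diff_configs` is a hypothetical helper, the ignored keys are a guess at fields that legitimately differ, and the example values are for illustration only.

```python
import json


def diff_configs(manual, reference, ignore=("_name_or_path", "transformers_version")):
    """Compare two config dicts; return {key: (manual_value, reference_value)} for mismatches."""
    differences = {}
    for key in sorted(set(manual) | set(reference)):
        if key in ignore:
            continue
        left = manual.get(key, "<missing>")
        right = reference.get(key, "<missing>")
        if left != right:
            differences[key] = (left, right)
    return differences


# Illustrative values only; in practice, load both files with json.load().
reference = {"model_type": "llama", "hidden_size": 4096, "num_hidden_layers": 32}
manual = {"model_type": "llama", "hidden_size": 4096, "num_hidden_layers": 30, "lora_r": 16}
print(json.dumps(diff_configs(manual, reference), indent=2))
```

Any key reported here is either a mistake in the manual file or a training/adapter parameter that does not belong in a model config.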

Hi, were you ever able to solve your problem? I have the same issue after using AutoTrain: there is no config.json file.


I think your adapter_model.bin is your pytorch_model.bin; maybe you just need to rename it.


I have the exact same issue! Any update on this?


For those still wondering, I found this answer helpful:

In other words, by training with PEFT, you haven’t saved the whole model, but only the parameters that were updated by LoRA. This is called the “adapter model”. In order to run inference on the whole fine-tuned model, you need to merge the adapter model with the original base model. Here is a guide to do it on your local machine.
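
A minimal sketch of that merge step, assuming the `peft`, `transformers` and `torch` packages are installed and that the adapter repository also ships the tokenizer files. `read_base_model_name` and `merge_adapter` are hypothetical helper names, and float16 is just one possible dtype choice; the library calls themselves (`PeftModel.from_pretrained`, `merge_and_unload`, `save_pretrained`) are the standard ones.

```python
import json


def read_base_model_name(adapter_config_path):
    """Read the base model id that PEFT records in adapter_config.json."""
    with open(adapter_config_path, encoding="utf-8") as handle:
        return json.load(handle)["base_model_name_or_path"]


def merge_adapter(base_model_id, adapter_id, output_dir):
    """Merge a LoRA adapter into its base model and save a standalone checkpoint."""
    # Imported lazily so the helper above works without the heavy dependencies.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: half precision is acceptable for the merged weights.
    base = AutoModelForCausalLM.from_pretrained(base_model_id, torch_dtype=torch.float16)
    model = PeftModel.from_pretrained(base, adapter_id)
    merged = model.merge_and_unload()  # folds the LoRA weights into the base layers
    merged.save_pretrained(output_dir)  # writes config.json plus the full weights
    # Assumption: the adapter repo contains the tokenizer files.
    AutoTokenizer.from_pretrained(adapter_id).save_pretrained(output_dir)
    return output_dir


# Example call (downloads several gigabytes, so it is left commented out):
# merge_adapter("NousResearch/Llama-2-7b-chat-hf", "Kabatubare/meta_douglas_2", "merged_model")
```

The resulting directory contains its own config.json and full weights, so it can be loaded with the usual `from_pretrained` call or uploaded as a regular model repository.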

However, if you want to run the inference API, you can use a library made for this purpose, called peft. To activate it, you need to specify it in the README of your uploaded model.
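
As a rough illustration of what "specifying it in the README" can look like, the sketch below builds the YAML front matter for a model card. It assumes the relevant metadata keys are `library_name` and `base_model`; `model_card_front_matter` is a hypothetical helper, and the base model id is the one mentioned earlier in this thread.

```python
def model_card_front_matter(base_model_id, library_name="peft"):
    """Build the YAML metadata block that goes at the very top of README.md."""
    lines = [
        "---",
        f"library_name: {library_name}",  # assumption: tells the Hub which library loads the repo
        f"base_model: {base_model_id}",   # assumption: points at the model the adapter was trained on
        "---",
        "",
    ]
    return "\n".join(lines)


print(model_card_front_matter("NousResearch/Llama-2-7b-chat-hf"))
```

The printed block would be pasted at the top of the repository's README.md, above any other text.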

Currently there is a problem: if the base model needs an authentication token (e.g. Llama 2), it won’t work, because the PEFT inference API struggles to use your token to fetch the base model. Maybe I am doing something wrong, but in the end I didn’t manage to make it work. If someone finds a solution I would love to know.

Disclaimer: I am not an expert, I am a beginner. Just trying to save some other people the hassle I went through to find it out '-_-


Hi,

[ISSUE]: {model_name} does not appear to have a file named _config.json

Any updates from the HuggingFace team? @sgugger

I’m having the same issue.
Step 1: (using HF Spaces AutoTrain GUI)
I fine-tuned mistralai/Mixtral-8x7B-Instruct-v0.1 on my data using SFT.

Step 2: (using HF Spaces AutoTrain GUI)
I’m trying to further fine-tune my SFT fine-tuned model on my data using DPO now.

Step 3: (using HF Spaces AutoTrain GUI)
In the APP pane I upload my train.csv and click on Start Training

Step 4: I get back the following error
"""
ERROR | 2024-03-13 09:41:28 | autotrain.trainers.common:wrapper:92 - xxxxx/autotrain-xxxxx-xxxxx does not appear to have a file named config.json. Checkout ‘https://huggingface.co/xxxxx-xxxxx/autotrain-xxxxx-xxxxx/main’ for available files.
"""

Doesn’t the config.json file get generated automatically when using AutoTrain? Should I create it myself? I’m not sure how. Is there a specific documentation page on the issue?

@sgugger could you please assist?

Many thanks

I’m getting the same error. Did you fix that problem, @nadav-sellence?

Thanks! That worked.