Missing config.json file after training with AutoTrain

Detailed Problem Summary


  • Environment: Google Colab (Pro Version using a V100) for training.
  • Tool: Utilizing Hugging Face AutoTrain for fine-tuning a language model.

Sequence of Events:

  1. Initial Training:
  • Successfully trained a model using AutoTrain.
  • Process seemingly completed without errors, resulting in several output files.
  2. Missing config.json:
  • Despite successful training, noticed that the config.json file was not generated.
  • Without config.json, the trained model cannot be loaded for inference or further training.
  3. Manual Configuration:
  • Created a config.json manually based on the base model used for fine-tuning (NousResearch/Llama-2-7b-chat-hf), plus additional training and adapter parameters derived from the fine-tuned model’s files that AutoTrain uploads to the HF repository.
  • Uploaded this config.json to the Hugging Face repository where the model resides.
  4. Upload to Repository:
  • Uploaded all relevant files, including pytorch_model.bin, adapter_config.json, adapter_model.bin, and others, to a Hugging Face repository named Kabatubare/meta_douglas_2.
  5. Model Loading Error:
  • Attempted to load the model and encountered the following error:

OSError: Kabatubare/meta_douglas_2 does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt, or flax_model.msgpack.

  6. File Size Anomaly:
  • Noticed that the uploaded pytorch_model.bin is only 888 bytes, far smaller than is typical for such files.

Repository File Structure:

  • adapter_config.json
  • adapter_model.bin
  • added_tokens.json
  • config.json (manually added)
  • pytorch_model.bin (888 Bytes, suspected to be incorrect or incomplete)
  • Tokenizer files (tokenizer.json, tokenizer.model, etc.)
  • Training parameters (training_args.bin, training_params.json)

Specific Questions for the Hugging Face Community:

  1. Configuration File: Why is a config.json not generated by AutoTrain by default? Is there a specific setting or flag that needs to be enabled to output this file?
  2. File Size Issue:
  • What could cause pytorch_model.bin to be so small (888 Bytes)?
  • Could this be a symptom of an incomplete or failed save operation?
  3. Manual Configuration:
  • Are there standard procedures or checks to verify that a manually created config.json is accurate?
  • Are there tools to validate the config.json against the actual PyTorch model file?
  4. Error Resolution:
  • How can the OSError encountered while loading the model be resolved?
  • Are there specific requirements for the directory structure when loading models from a Hugging Face repository?
  5. Model Integrity:
  • Given the missing config.json and the small size of pytorch_model.bin, are there steps to verify the integrity of the trained model?
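On the validation and integrity questions, a rough local sanity check is possible without loading the model at all. The sketch below is illustrative, not part of AutoTrain: the key list is a subset of what a Llama-style config.json typically contains, and the size threshold is an assumption (a full 7B-parameter pytorch_model.bin is gigabytes, so an 888-byte file almost certainly holds only metadata, not weights).

```python
import json
import os

# Illustrative subset of keys a Llama-style config.json is expected to
# contain before transformers can load the checkpoint; not exhaustive.
REQUIRED_KEYS = {"model_type", "architectures", "hidden_size",
                 "num_attention_heads", "num_hidden_layers", "vocab_size"}

def check_config(path):
    """Return the sorted list of required keys missing from a config.json."""
    with open(path) as f:
        cfg = json.load(f)
    return sorted(REQUIRED_KEYS - cfg.keys())

def looks_like_full_checkpoint(path, min_bytes=1_000_000):
    """Heuristic: a real full-model weight file is far larger than 1 MB.
    A few hundred bytes suggests an incomplete or metadata-only save."""
    return os.path.getsize(path) >= min_bytes
```

For the repo described above, `looks_like_full_checkpoint("pytorch_model.bin")` would return False for the 888-byte file, which already answers the integrity question for that file.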

Hi, were you ever able to solve your problem? I have the same issue after using AutoTrain and there is no config.json file.


I think your adapter_model.bin is your pytorch_model.bin; you may just need to rename it.
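A quick way to tell whether renaming could ever work is to look at the adapter_config.json sitting next to adapter_model.bin: if it names a base model, the file is a PEFT/LoRA adapter (only the trained deltas), not a standalone checkpoint. A small sketch, assuming the standard PEFT adapter-config field names (`peft_type`, `base_model_name_or_path`, `r`):

```python
import json

def describe_adapter(adapter_config_path):
    """Read a PEFT adapter_config.json and report what it wraps.
    The presence of this file next to adapter_model.bin means the repo
    holds a LoRA adapter, not a full model checkpoint."""
    with open(adapter_config_path) as f:
        cfg = json.load(f)
    return {
        "peft_type": cfg.get("peft_type"),              # e.g. "LORA"
        "base_model": cfg.get("base_model_name_or_path"),
        "rank": cfg.get("r"),                            # LoRA rank
    }
```

If `base_model` comes back non-empty, renaming the adapter to pytorch_model.bin won't make it loadable on its own; it has to be merged with that base model first.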


I have the exact same issue! Any update on this?


For those still wondering, I found this answer helpful:

In other words, by training with PEFT, you haven’t saved the whole model, but only the parameters that were updated by LoRA. This is called the “adapter model”. In order to run inference on the whole fine-tuned model, you need to merge the adapter model with the original base model. Here is a guide to do it on your local machine.
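The merge described above can be sketched as follows with the peft library. This is a minimal sketch, not a verified recipe for this exact repo: it downloads both the base model and the adapter, so it needs substantial disk space and RAM, and the imports are deferred inside the function for that reason.

```python
def merge_lora_adapter(base_model_id, adapter_id, output_dir):
    """Merge a LoRA adapter into its base model and save a standalone
    checkpoint that loads without peft. Heavyweight: downloads both the
    base model and the adapter, so imports are deferred until called."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base = AutoModelForCausalLM.from_pretrained(
        base_model_id, torch_dtype=torch.float16)
    model = PeftModel.from_pretrained(base, adapter_id)
    merged = model.merge_and_unload()   # folds LoRA deltas into base weights
    merged.save_pretrained(output_dir)  # writes weights *and* config.json
    AutoTokenizer.from_pretrained(base_model_id).save_pretrained(output_dir)

# Example call using the repo names from this thread:
# merge_lora_adapter("NousResearch/Llama-2-7b-chat-hf",
#                    "Kabatubare/meta_douglas_2", "./merged-model")
```

Note that `save_pretrained` on the merged model writes a config.json alongside the weights, which is why the merged output loads normally while the raw adapter repo does not.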

However, if you want to run the inference API, you can use a library made for this purpose, called peft. In order to activate it, you need to specify it in the README of your uploaded model.
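If I understand that setup correctly, the declaration goes in the YAML front matter at the top of the model repo’s README.md, using the standard model-card metadata keys; whether these two fields alone are sufficient for the adapter in question is an assumption:

```yaml
---
library_name: peft
base_model: NousResearch/Llama-2-7b-chat-hf
---
```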

Currently there is a problem: if the base model needs an authentication token (e.g. Llama 2), it won’t work. This is because the peft inference API struggles to use your token to fetch the base model. Maybe I am doing something wrong, but in the end I didn’t manage to make it work. If someone finds a solution I would love to know.

Disclaimer: I am not an expert, I am a beginner. Just trying to save some other people the hassle I went through to find it out '-_-



[ISSUE]: {model_name} does not appear to have a file named _config.json

Any updates from the Hugging Face team? @sgugger

I’m having the same issue.
Step 1: (using HF Spaces AutoTrain GUI)
I fine-tuned mistralai/Mixtral-8x7B-Instruct-v0.1 on my data using SFT.

Step 2: (using HF Spaces AutoTrain GUI)
I’m trying to further fine-tune my SFT fine-tuned model on my data using DPO now.

Step 3: (using HF Spaces AutoTrain GUI)
In the APP pane I upload my train.csv and click on Start Training

Step 4: I get back the following error
:x: ERROR | 2024-03-13 09:41:28 | autotrain.trainers.common:wrapper:92 - xxxxx/autotrain-xxxxx-xxxxx does not appear to have a file named config.json. Checkout ‘https://huggingface.co/xxxxx-xxxxx/autotrain-xxxxx-xxxxx/main’ for available files.

Doesn’t the config.json file get generated automatically when using AutoTrain? Should I create it myself? I’m not sure how. Is there a specific documentation page on the issue?

@sgugger could you please assist?

Many thanks :pray:t2:

I’m getting the same error. Did you fix that problem, @nadav-sellence?

Thanks! That worked.