Why is a config.json not generated by AutoTrain by default? Is there a specific setting or flag that needs to be enabled to output this file?
File Size Issue:
What could cause pytorch_model.bin to be so small (888 Bytes)?
Could this be a symptom of an incomplete or failed save operation?
Are there standard procedures or checks to verify that a manually created config.json is accurate?
Are there tools to validate the config.json against the actual PyTorch model file?
How can the OSError encountered while loading the model be resolved?
Are there specific requirements for the directory structure when loading models from a Hugging Face repository?
Given the missing config.json and the small size of pytorch_model.bin, are there steps to verify the integrity of the trained model?
Environment: Google Colab (Pro Version using a V100) for training.
Tool: Utilizing Hugging Face AutoTrain for fine-tuning a language model.
Sequence of Events:
Successfully fine-tuned the NousResearch/Llama-2-7b-chat-hf model using AutoTrain on a dataset (Kabatubare/frederick).
The process appeared to complete without errors and produced several output files, but no config.json was among them.
Without config.json, the trained model cannot be loaded for inference or further training.
Created a config.json manually (with help from GPT-4), based on the config of the base model used for fine-tuning (NousResearch/Llama-2-7b-chat-hf) plus training and adapter parameters taken from the fine-tuned model's files that AutoTrain uploads to the HF repository.
Uploaded this config.json to the Hugging Face repository where the model resides.
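One rough way to sanity-check a hand-written config.json is to diff it against the base model's config. The sketch below works on two already-parsed JSON dicts; `diff_configs` is a hypothetical helper name, and the sample values are purely illustrative. A clean diff would only show that the declared architecture fields agree, not that the config matches the weights actually stored in the repository.

```python
import json


def diff_configs(manual: dict, base: dict) -> dict:
    """Return keys whose values differ or that exist on only one side."""
    report = {}
    for key in sorted(set(manual) | set(base)):
        in_manual, in_base = key in manual, key in base
        if in_manual and in_base:
            if manual[key] != base[key]:
                report[key] = ("differs", manual[key], base[key])
        elif in_manual:
            report[key] = ("only_in_manual", manual[key], None)
        else:
            report[key] = ("only_in_base", None, base[key])
    return report


# Illustrative inputs only; in practice these would come from json.load()
# on the manually written config.json and the base model's config.json.
manual_cfg = json.loads('{"model_type": "llama", "hidden_size": 4096, "extra_key": 1}')
base_cfg = json.loads('{"model_type": "llama", "hidden_size": 4096, "vocab_size": 32000}')

for key, detail in diff_configs(manual_cfg, base_cfg).items():
    print(key, detail)
```

Keys reported as `only_in_manual` are worth a second look, since adapter or training parameters normally live in adapter_config.json and the training files rather than in the model config.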
Upload to Repository:
Uploaded all relevant files, including pytorch_model.bin, adapter_config.json, adapter_model.bin, and others, to a Hugging Face repository named Kabatubare/meta_douglas_2.
Model Loading Error:
Attempted to load the model and encountered the following error:
OSError: Kabatubare/meta_douglas_2 does not appear to have a file named pytorch_model.bin, tf_model.h5, model.ckpt, or flax_model.msgpack.
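If the repository actually holds a PEFT adapter (adapter_config.json plus adapter weights) rather than full model weights, one common approach is to load the base model first and then attach the adapter with the peft library. A minimal sketch under that assumption, not run against this repository: `looks_like_adapter_repo` and `load_with_adapter` are hypothetical helper names, and the float16 dtype is an assumption.

```python
def looks_like_adapter_repo(filenames):
    """True if the listing contains both an adapter config and adapter weights."""
    names = set(filenames)
    has_config = "adapter_config.json" in names
    has_weights = bool(names & {"adapter_model.bin", "adapter_model.safetensors"})
    return has_config and has_weights


def load_with_adapter(base_id, adapter_id):
    """Load the base model, then attach the adapter stored under adapter_id."""
    # Imported lazily so the check above works without these packages installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
    model = PeftModel.from_pretrained(base, adapter_id)
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    return model, tokenizer
```

With the names from this post, the call would be `load_with_adapter("NousResearch/Llama-2-7b-chat-hf", "Kabatubare/meta_douglas_2")`. This path reads the adapter files, so the manually written config.json and the 888-byte pytorch_model.bin should not be needed for it.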
File Size Anomaly:
Noticed that the uploaded pytorch_model.bin is only 888 bytes, far smaller than the multi-gigabyte weight file expected for a 7B-parameter model.
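A quick local check can help narrow down what such a small file actually is. The sketch below assumes the file has been downloaded to disk; `describe_weight_file` and the 1 MB threshold are hypothetical choices. It flags a Git LFS pointer (a small text stub that starts with the LFS spec line) and otherwise reports the file as too small to hold real weights, which could mean, for example, a checkpoint containing no tensors.

```python
import os

LFS_PREFIX = b"version https://git-lfs.github.com/spec/v1"
MIN_PLAUSIBLE_BYTES = 1_000_000  # assumed threshold, not an official limit


def describe_weight_file(path):
    """Classify a weight file as 'lfs_pointer', 'too_small', or 'plausible'."""
    size = os.path.getsize(path)
    with open(path, "rb") as fh:
        head = fh.read(len(LFS_PREFIX))
    if head == LFS_PREFIX:
        return "lfs_pointer"  # text stub uploaded instead of the real binary
    if size < MIN_PLAUSIBLE_BYTES:
        return "too_small"  # cannot contain the weights of a large model
    return "plausible"
```

A result of `plausible` only means the size check passed; it says nothing about whether the tensors inside are correct.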
Repository file structure (AutoTrain produced all of these except config.json):
config.json (manually added)
pytorch_model.bin (888 Bytes, suspected to be incorrect or incomplete)
Tokenizer files (tokenizer.json, tokenizer.model, etc.)
Training parameters (training_args.bin, training_params.json)