Trying to push my model back to the hub from python (not notebook) and failing so far:
I am using a T5 model with the latest development version of the example “run_summarization.py” and pass a load of runtime parameters in and my model works fine. There are some parameters that seem to relate to pushing the model back to the hub which I have identified from the “run_summarization.py -h” text:
--use_auth_token - Will use the token generated when running transformers-cli login (necessary to use this script with private models). (default: False) - I assume I need to set this to True given I ran the cli and it saved my token in the cache?
--push_to_hub - Whether or not to upload the trained model to the model hub after training. (default: False) - I set this to True
--push_to_hub_model_id - The name of the repository to which push the Trainer. (default: None) - *I set this to a string that is my model name, like "my_model", I guess?*
--push_to_hub_organization - Not relevant for me since I am an individual?
--push_to_hub_token - Not needed if I set --use_auth_token True
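Putting those flags together, my full command looks roughly like the sketch below (the model name, data file, and output directory are placeholders for my actual values, and I am guessing at the exact flag combination, hence the error that follows):

```shell
python run_summarization.py \
    --model_name_or_path t5-small \
    --do_train \
    --train_file train.json \
    --output_dir ./my_model \
    --use_auth_token True \
    --push_to_hub True \
    --push_to_hub_model_id my_model
```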
OSError: Tried to clone a repository in a non-empty folder that isn't a git repository. If you really want to do this, do it manually: git init && git remote add origin && git pull origin main, or clone repo to a new folder and move your existing files there afterwards.
As I said above, I did transformers-cli login successfully in my environment. I thought maybe I needed to do as I had seen in an example Colab notebook:
subprocess.CalledProcessError: Command '['git-lfs', '--version']' returned non-zero exit status 1.
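For what it's worth, that CalledProcessError usually means the git-lfs binary is missing from PATH or broken. A small stdlib-only diagnostic sketch (this is my own check, not part of run_summarization.py) to see whether git-lfs is usable before running the script:

```python
import shutil
import subprocess

def git_lfs_available() -> bool:
    """Return True if the git-lfs binary is on PATH and responds to --version."""
    if shutil.which("git-lfs") is None:
        return False  # not installed at all
    try:
        # The same command that appears in the error above.
        subprocess.run(["git-lfs", "--version"], check=True, capture_output=True)
        return True
    except (OSError, subprocess.CalledProcessError):
        return False

print(git_lfs_available())
```

If this prints False, installing git-lfs (e.g. via the system package manager) and running git lfs install should clear that particular error.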
But not sure if needed (I am guessing)! I can supply the Trace for both kinds of errors above if needed, but I don’t know what minimal configuration works running a .py file to see if I am being a dumb user and the problem is usage or the problem is something else.
Any help on correct usage appreciated or point me to a working example? Thanks!
As the error indicates, you are trying to clone an existing repository in a folder that is not a git repository, so you should use an empty folder, or an ID for a new repository.
Sorry @sgugger, could you provide a bit more explanation please? I am no GitHub expert, and other than the error message I didn't know that I was trying to clone an existing repository, so I don't know how to interpret the message at all (yes, I am a numpty, sorry!). Could you point me to a runtime parameter configuration that I could expect to work? For example, the following doesn't work and produces the same git error as above:
My expectation from the above is that it should create and save my model as "t5_tuesday" under my account as TheLongSentance/t5_tuesday. Despite the error, this actually seems to happen when I check my Hugging Face account, but I still get the git error during the run?
And do I need to do the two !git config statements?
And finally do I need to !pip install hf-lfs or is that not needed any more?
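For reference, the two !git config statements I saw in the Colab notebook were the standard git identity setup (the email and name below are placeholders for your own details):

```shell
git config --global user.email "you@example.com"
git config --global user.name "Your Name"
```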
Change your output_dir or delete its content before re-running the script, or use --overwrite_output_dir to remove its content.
The problem is that you have a non-empty output_dir in which you are trying to clone an existing model.
Also make sure you are on the latest version of the script and Transformers.
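To illustrate the fix: the clone into output_dir only succeeds when the folder is empty (or is already the matching git repository), which is roughly the precondition that --overwrite_output_dir guards. A stdlib-only sketch of that check (my own approximation, not the actual Trainer code):

```python
import os
import shutil
import tempfile

def prepare_output_dir(output_dir: str, overwrite: bool = False) -> None:
    """Raise OSError if output_dir is non-empty, unless overwrite is set."""
    if os.path.isdir(output_dir) and os.listdir(output_dir):
        if not overwrite:
            raise OSError(
                f"{output_dir} is not empty; delete its contents or pass overwrite=True"
            )
        shutil.rmtree(output_dir)  # what --overwrite_output_dir effectively allows
    os.makedirs(output_dir, exist_ok=True)

# Demo on a throwaway directory containing a stale file:
demo = tempfile.mkdtemp()
open(os.path.join(demo, "stale_checkpoint.bin"), "w").close()
prepare_output_dir(demo, overwrite=True)
print(os.listdir(demo))  # → []
```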
Same problem here. It's unclear what is happening. I watched the video on this and this error didn't come up, so it is hard to piece it all together. It seems like push_to_hub should just push the model to the hub.
Is there a sequence of steps that new users can follow? Something real basic that doesn’t leave anything out.
Had an issue when I used PEFT+QLoRA and tried to save the adapter of the model with:
model.push_to_hub(...).
→ the call actually creates an empty repo on Hugging Face, and ends with the above-mentioned error OSError: Tried to clone a repository in a non-empty folder that isn't a git repository.
Then I changed to:
trainer.model.push_to_hub(...).
And it worked; the adapter files were saved to Hugging Face.
Thanks @honzatoegel, this fix worked for me! The model was saved as a .bin file, however I'm confused by the documentation: is it the model's state dict that's stored, or is it something else?