Chapter 4 questions

sgugger · June 14, 2021, 2:53pm

Use this topic for any question about Chapter 4 of the course.

khalidsaifullaah · June 30, 2021, 4:04pm

Can someone take a look at the Chapter 4 notebook? The .push_to_hub() method isn’t working. At first the error was about ‘git-lfs’, and even installing that doesn’t seem to help…

Thanks

sgugger · June 30, 2021, 4:16pm

Could you tell us more? Which notebook are you running? When does it fail?

khalidsaifullaah · June 30, 2021, 5:14pm

This section’s colab notebook: Sharing models and tokenizers - Hugging Face Course

When I execute model.push_to_hub("dummy-model") it throws an error. I’ve tried to fix it by running sudo apt-get install git-lfs, but it still doesn’t work.

By the way, I’m running all my code in Google Colab, so I think you could reproduce the error by running the notebook in Colab.

Thanks

osanseviero · June 30, 2021, 5:46pm

@khalidsaifullaah I see the following error
ValueError: If not specifying clone_from, you need to pass Repository a valid git clone.

Do you see the same? If yes, for the time being, setting use_temp_dir=True in the push_to_hub params solved the issue for me.
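
For anyone landing here later, a minimal sketch of that workaround. The helper name push_with_temp_dir is hypothetical (it is not part of Transformers); it only assumes that model is an object exposing push_to_hub, such as the model from the course notebook, and that you are already logged in to the Hub.

```python
def push_with_temp_dir(model, repo_name):
    """Push `model` to the Hub with the use_temp_dir workaround.

    `model` is assumed to be any object exposing a Transformers-style
    `push_to_hub` method. This helper is illustrative only.
    """
    # use_temp_dir=True asks push_to_hub to work from a temporary directory
    # rather than a folder in the current working directory, which avoided
    # the "valid git clone" ValueError quoted above.
    return model.push_to_hub(repo_name, use_temp_dir=True)


# Usage (requires Hub credentials, so it is left commented out):
# push_with_temp_dir(model, "dummy-model")
```

This is equivalent to calling model.push_to_hub("dummy-model", use_temp_dir=True) directly; the wrapper just makes the extra argument explicit.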

khalidsaifullaah · June 30, 2021, 5:51pm

Yeah, I saw this one as well. Thanks for the solution, I’ll try using it now…
But I was actually wondering why @sgugger didn’t hit these errors when he ran this same code in the “Push To Hub” video (maybe it has something to do with Colab’s dependencies? In the video he used a Jupyter notebook).

sgugger · June 30, 2021, 6:12pm

I have updated the install instructions in the notebooks to reflect all the necessary steps. Could you try again (on the latest version of the Colab notebook) and tell me if it is working?

khalidsaifullaah · June 30, 2021, 6:55pm

thanks @sgugger!
just checked the notebook, it’s working fine now!

khalidsaifullaah · July 5, 2021, 6:16am

I’m getting this even though git lfs is installed

lewtun · July 5, 2021, 10:45am

hey @khalidsaifullaah could you please share a minimal example so i can try to reproduce the error on my side?

khalidsaifullaah · July 5, 2021, 1:02pm

thanks for the quick response @lewtun!
I was actually trying to pretrain a RoBERTa model on GCP’s TPU using HF’s RoBERTa Flax training script. I followed these steps to do it - transformers/examples/flax/language-modeling at master · huggingface/transformers (github.com)

For the time being, I sidestepped the error by removing the --push_to_hub flag when running the training script.

lewtun · July 5, 2021, 1:16pm

ok thanks for the info! since the flax integration is quite new in transformers it’s possible there are some rough edges when it comes to integration with the hub.

i’ll try to reproduce the error and report back

ps. you should be able to push the model to the hub using plain old git-lfs if you really need it

khalidsaifullaah · July 5, 2021, 4:57pm

Thanks @lewtun! Really appreciate your support.

Just wanted to be clear on one thing-
Since I’ve removed the push_to_hub flag, the model.save_pretrained() method saves the checkpoints to my local directory after every epoch. Once training is done, should I just run the following to upload everything to my model repo on the hub?

git add .
git commit -m "model trained"
git push origin main

Are any other commands necessary (like git lfs)? If so, in what order should they go? Could you maybe give some suggestions?

Thanks

lewtun · July 5, 2021, 5:13pm

yes, for files larger than 10MB you’ll need to run git lfs track before git add, e.g.

git lfs track some_large_file.huge
git add .gitattributes
git add some_large_file.huge
git commit -m "add model files"

hth!
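
If it helps, here is a rough sketch of the same sequence as a small Python helper. The names find_large_files and build_upload_commands are made up for illustration, and treating 10MB as 10 * 1024 * 1024 bytes is an assumption. It only builds the git commands as lists; it does not run them.

```python
from pathlib import Path

# Assumption: "10MB" is interpreted as 10 * 1024 * 1024 bytes.
LFS_THRESHOLD_BYTES = 10 * 1024 * 1024


def find_large_files(repo_dir, threshold=LFS_THRESHOLD_BYTES):
    """Return paths (relative to repo_dir) of files larger than threshold."""
    root = Path(repo_dir)
    large = []
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        # Skip git's own bookkeeping and anything that is not a regular file.
        if ".git" in relative.parts or not path.is_file():
            continue
        if path.stat().st_size > threshold:
            large.append(relative.as_posix())
    return large


def build_upload_commands(large_files, message="add model files"):
    """Build the git commands as argv lists: track first, then add, commit, push."""
    commands = [["git", "lfs", "track", name] for name in large_files]
    if large_files:
        # git lfs track records its patterns in .gitattributes,
        # so that file has to be committed as well.
        commands.append(["git", "add", ".gitattributes"])
    commands.append(["git", "add", "."])
    commands.append(["git", "commit", "-m", message])
    commands.append(["git", "push", "origin", "main"])
    return commands
```

Each returned list could be passed to subprocess.run(cmd, cwd=repo_dir, check=True) from inside the cloned model repo, assuming git and git-lfs are installed there.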

khalidsaifullaah · July 5, 2021, 5:54pm

Thanks a lot!

khalidsaifullaah · July 6, 2021, 4:48pm

I’m getting this when trying to push to the hub. I ran git lfs track flax_model.msgpack and git lfs track *tfevents* before committing and pushing…

khalidsaifullaah · July 6, 2021, 6:14pm

finally was able to push with the help of this - Failed to push model repo · Issue #8504 · huggingface/transformers (github.com)

dk-crazydiv · July 8, 2021, 6:21pm

Hi team,
Going through the last part got me thinking about some questions regarding quotas & limits:

Is there any limit on the number of private and public repos a user can have?
Is there any limit on the size of an individual dataset/model repo? Or a per-account limit? (e.g. on Kaggle, each user gets a fixed number of GBs for hosting, and the total must stay within that limit.)
If I make 1000 commits of a 1GB model, will the full 1TB be ‘always accessible’, or are there some stack limitations wrt git history?
Is there any limit on the number of downloads per model (specifically a privately uploaded model)?

These questions are not only about what’s supported right now, but also about the near future. E.g. if I upload a public/private model (hypothetically for both commercial and non-commercial use) and don’t use the Inference API (just storage), would there be any threat to the (1) stability or (2) scalability of such a pipeline?

lewtun · July 9, 2021, 3:29pm

hey @dk-crazydiv in the near-to-mid future, there are no limits

dk-crazydiv · July 9, 2021, 10:25pm

Thank you @lewtun, but I am imagining myself crossing the Gartner hype cycle and identifying the plateau on which the offering lands. The fan in me would love for this to be possible, but in practice it raises many concerns. Could you please elaborate a bit?
No limits on:

number of private repos on modelhub/datasethub
number of public repos
size of the repos
number of commits on those repos
number of downloads from those repos
speed cap on downloads of those repos.

Also, if someone is judged to be “abusing” the policies or taking “unfair” advantage, does HF reserve the right to ban them? If so, that is even more concerning, as such a judgment is very subjective.

Before going deep into specific use cases, it would be great if you could point me to the policies and terms of use, as they will probably clear up many of the cases I am thinking of.
