Due to various corporate network hoops, downloading models on the fly is not possible. Searching related posts, I can see that others have downloaded the models and saved them locally, but again, the initial download is the problem. At this point, I can see two options: git lfs to download the repo, and the huggingface_hub Python library.
For the git lfs option, I believe it downloads the whole repository (well, one commit deep). As an example, I'm looking at sshleifer/distilbart-cnn-12-6 · Hugging Face. I can see it has a PyTorch model, a Rust model, and an msgpack model, each over 1 GB in size. I'm looking to use this for local development, as well as embedding it in a container to run on servers (so as not to download it on each run, which would be a massive waste). The git lfs option downloads the whole thing.
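For reference, one middle ground I've been considering instead of a full git lfs clone is `snapshot_download` with an allow-list. This is only a sketch: it assumes a recent huggingface_hub version that supports `allow_patterns`, and the repo file list below is just what I see on the model's "Files" tab.

```python
from fnmatch import fnmatch

# Files I see in sshleifer/distilbart-cnn-12-6 (my reading of the repo page).
REPO_FILES = [
    "config.json",
    "flax_model.msgpack",
    "merges.txt",
    "pytorch_model.bin",
    "rust_model.ot",
    "tokenizer_config.json",
    "vocab.json",
]

# Patterns that keep the PyTorch weights plus config/tokenizer files,
# skipping the >1 GB Rust (.ot) and msgpack weights.
ALLOW_PATTERNS = ["*.json", "*.txt", "pytorch_model.bin"]

def kept_files(files, patterns=ALLOW_PATTERNS):
    """Return which repo files the allow-list would actually download."""
    return [f for f in files if any(fnmatch(f, p) for p in patterns)]

if __name__ == "__main__":
    # Requires: pip install huggingface_hub
    from huggingface_hub import snapshot_download

    local_dir = snapshot_download(
        "sshleifer/distilbart-cnn-12-6",
        allow_patterns=ALLOW_PATTERNS,
    )
    print(local_dir)  # directory that can be baked into the container image
```

With the patterns above, the two large non-PyTorch weight files would be filtered out, which seems closer to what I want for the container image than a full clone.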
With the huggingface_hub library, I can select individual files, and this seems to work on our network. However, I'm a bit confused about which files I need in order to load the model. We're running with PyTorch, so can I just download the PyTorch .bin model, or would I need the msgpack and Rust models too? Would I also require config.json, merges.txt, tokenizer_config.json, and vocab.json?
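For what it's worth, here is roughly what I'm trying with `hf_hub_download`. The file list is my guess at the minimum set for a PyTorch-only setup (weights plus model config plus tokenizer files), not something I've confirmed:

```python
REPO_ID = "sshleifer/distilbart-cnn-12-6"

# My guess at the minimal files for PyTorch: weights + config + tokenizer.
NEEDED_FILES = [
    "pytorch_model.bin",      # PyTorch weights (skipping the Rust/msgpack ones)
    "config.json",            # model architecture / hyperparameters
    "tokenizer_config.json",  # tokenizer settings
    "vocab.json",             # BPE vocabulary
    "merges.txt",             # BPE merge rules
]

def download_all(repo_id=REPO_ID, files=NEEDED_FILES, cache_dir="./models"):
    """Fetch each file individually into a local cache directory."""
    # Requires: pip install huggingface_hub
    from huggingface_hub import hf_hub_download

    return [
        hf_hub_download(repo_id=repo_id, filename=f, cache_dir=cache_dir)
        for f in files
    ]

if __name__ == "__main__":
    paths = download_all()
    print(paths)
    # My hope is that the resulting directory can then be loaded offline,
    # e.g. AutoModelForSeq2SeqLM.from_pretrained(<that directory>),
    # but I'd appreciate confirmation that this file set is sufficient.
```

Does this look like the right set of files, or am I missing something the model loader will ask for?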