Hi!
I’ve got around 4 machines in my office, and we’re constantly pulling resources from HuggingFace to try out new models, train against datasets, and so on.
Docker supports pull-through cache registries, so we run a Docker Registry cache server that keeps images locally: our servers pull as normal from Docker Hub or other registries, and anything already cached saves us the bandwidth.
We’re trying to do the same thing with HuggingFace. I understand there are multiple ways to fetch data from the Hub (Git LFS and your more optimised parallel downloader), so I wanted to see if anyone else has built or theorised a caching MITM?
Ideally I’d love to simply redirect traffic that would otherwise go to a HuggingFace file repository to a local server that can pull and serve the files, without having to change code. Libraries and existing servers should still try to reach the original HuggingFace servers, but at a host or environment-variable level we can instruct them to try a proxy first and save some bandwidth. I understand there are security concerns, and that the hacky way to do this (intercepting and caching assets over HTTPS) requires root authority over the traffic; I’d rather not go that complex. Otherwise, this is local and not public, so I trust the infrastructure inside the network to handle temporary caching.
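For example, I believe huggingface_hub honours the HF_ENDPOINT environment variable (correct me if I’m wrong), so something like this, with a made-up internal hostname, should send downloads through a local proxy without any library changes:

```python
# Sketch only: hf-cache.internal is a made-up hostname for a local proxy.
import os
# huggingface_hub reads HF_ENDPOINT at import time, so set it first
# (normally you'd export this in the environment rather than in code).
os.environ["HF_ENDPOINT"] = "http://hf-cache.internal"

from huggingface_hub import snapshot_download

# Resolves against the proxy instead of https://huggingface.co; the proxy
# would forward cache misses upstream and keep the blobs on local disk.
path = snapshot_download("bert-base-uncased")
print(path)
```

That only covers the library downloader, though; git / Git LFS clones would presumably need their remote URL pointed at the proxy (or an insteadOf rewrite) separately.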
I know HuggingFace has a .cache folder, but I can’t easily mount that over the network reliably.
Love to hear what you guys think; if nothing exists I’m keen to implement something.
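Roughly what I have in mind is a small pull-through proxy, along the lines of the sketch below. This is purely illustrative: the cache path and port are made up, and a real version would also need to answer the HEAD metadata requests the client library makes, forward auth headers, follow the CDN redirects for LFS blobs properly, and stream rather than buffer whole files.

```python
# Very rough pull-through cache sketch, stdlib only. Office machines speak
# plain HTTP to this proxy; the proxy talks HTTPS upstream, so no TLS
# interception or root CA is needed on the clients.
import hashlib
import pathlib
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

UPSTREAM = "https://huggingface.co"           # real upstream
CACHE_DIR = pathlib.Path("/srv/hf-cache")     # made-up local cache path

class CachingProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        # Key the cache on the request path (repo, revision, filename).
        key = hashlib.sha256(self.path.encode()).hexdigest()
        cached = CACHE_DIR / key
        if not cached.exists():
            # Cache miss: fetch from huggingface.co (urllib follows the
            # redirect to the CDN) and store the body on local disk.
            with urllib.request.urlopen(UPSTREAM + self.path) as upstream:
                CACHE_DIR.mkdir(parents=True, exist_ok=True)
                cached.write_bytes(upstream.read())  # buffers in RAM; fine for a sketch
        body = cached.read_bytes()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # Clients would then set HF_ENDPOINT=http://<this host>:8080
    ThreadingHTTPServer(("0.0.0.0", 8080), CachingProxy).serve_forever()
```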