Hello, I just wanted to understand the spirit of the license and the tokens a bit better.
First, what are the use tokens doing? Does every Colab or local bit of Python that uses the token stay connected to Hugging Face servers somehow? Is it transferring information about usage? Is it just used to download some source temporarily?
Second, is the available code sufficient to reproduce the same results without the Hugging Face token? Or are essential parts of the API locked behind something that is accessed with the tokens?
Is the intent basically that anything integrating this code will require Hugging Face token logins?
I have only tried the inference functionality, which requires the token, so as far as I know the token is needed for some parts of the API, such as inference. However, if your computer can store and run the model, you can download it and run it locally without the token. The results should be just as good as with the inference function.
This link may be of interest: Installation. Note that you can run the model completely offline, using only your local files.
For inference, the token is only used to download the model checkpoint during the first call to
StableDiffusionPipeline.from_pretrained(). After that, the checkpoint is cached in
~/.cache/huggingface/diffusers. When you re-run the code, diffusers will check the cache first, so as long as it is there, you don’t need to supply the token again.
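As a concrete sketch of that flow (the cache path is the one quoted above and may differ in newer diffusers releases, which cache under the shared Hub directory; `use_auth_token=True` reads a token previously saved with `huggingface-cli login`):

```python
import os

# Default diffusers cache location mentioned above. The exact path is taken
# from this thread; newer library versions may cache under
# ~/.cache/huggingface/hub instead.
cache_dir = os.path.expanduser("~/.cache/huggingface/diffusers")
print("checkpoint cache:", cache_dir)

# First run only: the token authorizes the download. On later runs the cached
# files are reused and no token is needed (call left commented as a sketch).
#
# from diffusers import StableDiffusionPipeline
# pipe = StableDiffusionPipeline.from_pretrained(
#     "CompVis/stable-diffusion-v1-4",
#     use_auth_token=True,  # reads the token saved by `huggingface-cli login`
# )
```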
If you don’t have persistent storage (for example, in a Colab notebook), you will need the token each time to repeat the download. As an alternative, you can store the token and the checkpoint on Google Drive to avoid re-downloading.
The “Connect to Google Drive” and “Connect to Hugging Face” cells in the stablediffusion-quickly Colab notebook have example code for caching both the token and the model.
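The token half of that caching can be sketched as follows (the file path is hypothetical; in Colab you would first mount Drive with `from google.colab import drive; drive.mount(...)` so the file survives runtime resets):

```python
from pathlib import Path

# Sketch: keep the token in a file on mounted Google Drive so a fresh Colab
# runtime doesn't have to ask for it again. "MyDrive/hf_token.txt" is an
# illustrative path, not one the notebook mandates.
token_file = Path("MyDrive") / "hf_token.txt"
token_file.parent.mkdir(parents=True, exist_ok=True)

if not token_file.exists():
    token_file.write_text("hf_XXXX")  # placeholder: paste your real token once

token = token_file.read_text().strip()
print("token loaded:", bool(token))
```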
Thank you both for your help. I have another couple of questions if I may.
- Is it possible to control where the files get cached on desktop platforms?
- Is it possible to easily retrieve the cache location on a platform without searching or stepping through code?
- Is it possible to disable the obfuscation of file names and file links in the cache?
- If someone has already downloaded the model for a different local application or notebook, is there a way to tell it to load the model from there, and use the cache for everything else? (I would just use symbolic links like the stablediffusion-quickly notebook does, but that solution doesn’t work in every case; on Windows, for instance, a symbolic link doesn’t work across drives.)
- Is there diffusers documentation somewhere for the entire API?
Basically, I’d like to integrate this into a larger system and add functionality, so I need to be able to see and edit code, and it seems like it’s designed to prevent anyone from easily messing with anything.
.from_pretrained() accepts a directory path to load the model from. There is also an argument called
cache_dir that specifies where to save and load the cached model.
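A minimal sketch of both options (the resolve_model_source helper and its model_index.json check are my own illustration, not part of the library; the commented call shows where a local path or cache_dir would be passed):

```python
import os

def resolve_model_source(local_dir: str, hub_id: str) -> str:
    """Hypothetical helper: prefer an existing local checkout (e.g. a
    `git clone` of the model repo) and fall back to the Hub id otherwise.
    Diffusers pipelines ship a model_index.json at the top level, so its
    presence is a reasonable marker for a complete local copy."""
    if os.path.exists(os.path.join(local_dir, "model_index.json")):
        return local_dir
    return hub_id

source = resolve_model_source(
    "./stable-diffusion-v1-4", "CompVis/stable-diffusion-v1-4")

# pipe = StableDiffusionPipeline.from_pretrained(
#     source,                  # local directory path or Hub model id
#     cache_dir="./hf-cache",  # only used when downloading from the Hub
# )
print(source)
```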
You can get the location of the cache with diffusers.utils.DIFFUSERS_CACHE (the constant may have moved in newer releases):

from diffusers import utils
print(utils.DIFFUSERS_CACHE)
Diffusers is probably best approached as a cutting-edge research tool rather than a package with a well-documented API (especially since the API design is still an open discussion). However, the code base is not very big, so you can learn a lot just by reading the source of the modules (pipelines, schedulers, models) you are interested in.
Also, the model can be downloaded outside of the diffusers library with
git lfs install
git clone https://huggingface.co/CompVis/stable-diffusion-v1-4
I think the intent is to make sure users are complying with the SD license. If a user of your code is supplying their own HF token, then they’ve accepted the license, which simplifies permissions. But if it’s part of a larger system, the developer needs to make sure that the redistribution terms are all being followed (Section III).
Thanks again. You’ve been an enormous help.