Smaller pretrained models for Stable Diffusion?

As I restart my program for the hundredth time, I think: maybe if I used a smaller model while developing, I wouldn’t spend so long waiting for it to load.

Are there smaller pretrained models available that work with the same 🧨 diffusers StableDiffusionPipeline?

I looked at a few of the options on the Hugging Face model hub, but they seem to be mostly fine-tunings of the SD models and they’re not any smaller, and/or they don’t have fp16 weights published.

Hi @keturn!

If you just want fast loading and don’t care about generation quality, you could try this tiny model for development. It’s not trained or anything, so predictions will be useless, but it could be helpful when you are working on new features.

You can use it like this:

from diffusers import DiffusionPipeline
pipe = DiffusionPipeline.from_pretrained("hf-internal-testing/tiny-stable-diffusion-pipe")

If you do need some reasonable outputs, then I’m not sure what would be the best option.
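
One way to make the tiny model easy to switch on while developing is an environment flag. This is only a rough sketch: the SD_DEV_TINY variable and both helper names are made up for illustration, and the full model ID is whatever you normally load.

```python
import os

# Tiny, untrained test pipeline mentioned above; its outputs are meaningless.
TINY_MODEL_ID = "hf-internal-testing/tiny-stable-diffusion-pipe"


def pick_model_id(full_model_id, env=os.environ):
    """Return the tiny model ID when SD_DEV_TINY=1 (a hypothetical flag), else the full one."""
    if env.get("SD_DEV_TINY") == "1":
        return TINY_MODEL_ID
    return full_model_id


def load_pipeline(full_model_id):
    """Load whichever pipeline pick_model_id selects."""
    # Imported lazily so pick_model_id stays usable without diffusers installed.
    from diffusers import DiffusionPipeline

    return DiffusionPipeline.from_pretrained(pick_model_id(full_model_id))
```

Running with SD_DEV_TINY=1 would then give the fast-loading pipeline, and running without it would give the real one, with no change to the rest of the program.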

Have you tried passing local_files_only=True to from_pretrained? I think the slow part is that it checks for updates online. I’ve also serialized the pipeline to disk, so load time is just deserialization, which is pretty fast.

Hmm, what makes you think that? I never suspected that might be a problem, since these files aren’t getting updates often.

I tried local_files_only and it saves about a second (less than 10% of the total time).
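
For anyone who wants to measure this on their own machine, here is a minimal timing sketch. The helper names timed and compare_load_times are made up for illustration, and it assumes the model is already in the local cache, since local_files_only=True fails otherwise.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def compare_load_times(model_id):
    """Time from_pretrained with and without the online update check."""
    # Imported lazily so that timed() can be used without diffusers installed.
    from diffusers import StableDiffusionPipeline

    _, default_s = timed(StableDiffusionPipeline.from_pretrained, model_id)
    _, offline_s = timed(
        StableDiffusionPipeline.from_pretrained, model_id, local_files_only=True
    )
    return default_s, offline_s
```

The second load may also benefit from the operating system's file cache being warm, so it is worth repeating the runs in both orders before trusting the difference.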

How is that different than the default implementation? Do you have a serialization format that’s lots faster to load than pytorch’s pickle-based format?

I don’t need anything impressive, but yes, something like reasonable outputs would be nice so I can tell whether or not I’ve completely biffed up the inputs or processing.

Thank you both for the suggestions!

> Hmm, what makes you think that?

Just an assumption. Mostly wrong, it looks like.

> How is that different than the default implementation? Do you have a serialization format that’s lots faster to load than pytorch’s pickle-based format?

Plain old pickle binary format. But I’m serializing the output of from_pretrained, so I pay that cost only once. It seems about 5 times faster, give or take (this is on an NVMe drive, mind you).
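
A rough sketch of that idea, for reference. The load_or_build helper and the cache file name are made up for illustration, and whether a given pipeline object pickles cleanly is an assumption based on the report above rather than something the library guarantees.

```python
import pickle
from pathlib import Path


def load_or_build(cache_path, build):
    """Return the pickled object at cache_path, building and saving it on first use."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        # Later runs: deserialize the already-built object.
        with cache_path.open("rb") as f:
            return pickle.load(f)
    # First run: pay the full construction cost, then cache the result.
    obj = build()
    with cache_path.open("wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)
    return obj


# Hypothetical usage with a pipeline (model_id is whatever you normally load):
# from diffusers import StableDiffusionPipeline
# pipe = load_or_build(
#     "pipe.pkl", lambda: StableDiffusionPipeline.from_pretrained(model_id)
# )
```

Unpickling runs arbitrary code, so this should only be used on cache files you wrote yourself. The cache also goes stale if the model or library version changes, so delete the file when either does.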