Hi, I’m faced with a confusing situation and would appreciate any advice. My colleague trained some LoRA models and deployed SD with webui. He then found the performance unsatisfactory (due to old GPU machines), and I was assigned the task of improving it a little. I’m new to the SD area, and after some digging I found there are roughly two mainstream architectures, i.e., CompVis (or stability-ai or A1111) and diffusers. I compared the performance of 3 methods: CompVis’ scripts/txt2img.py (assuming it’s close to webui), diffusers, and onediff. The results showed that performance increases in that order. But webui provides lots of additional functionality out of the box, like prompt weighting, dynamic LoRA loading/unloading, and textual inversion. I know these could be achieved using diffusers plus some third-party libs or customized code. So I’m planning to build on diffusers’ pipeline and integrate webui’s functionality on top of it. But I’m not sure whether it’s worth it.
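To give a concrete sense of what “customized code” means here: webui’s `(word:1.2)` prompt-weighting syntax isn’t built into diffusers (third-party libraries like compel provide it for diffusers pipelines). Below is a minimal sketch of parsing that syntax yourself; the regex and function name are my own illustration, and it only handles the explicit `(text:weight)` form, not nested parentheses or the bare `(text)` shorthand:

```python
import re

# Matches WebUI-style "(text:1.2)" weighted segments; everything else
# gets the default weight of 1.0.
WEIGHT_RE = re.compile(r"\(([^():]+):([0-9.]+)\)")

def parse_weighted_prompt(prompt: str) -> list[tuple[str, float]]:
    """Split a prompt into (text, weight) fragments."""
    fragments = []
    pos = 0
    for m in WEIGHT_RE.finditer(prompt):
        plain = prompt[pos:m.start()].strip()  # plain text before the match
        if plain:
            fragments.append((plain, 1.0))
        fragments.append((m.group(1).strip(), float(m.group(2))))
        pos = m.end()
    tail = prompt[pos:].strip()
    if tail:
        fragments.append((tail, 1.0))
    return fragments

print(parse_weighted_prompt("a photo of (a cat:1.3) on a sofa"))
# [('a photo of', 1.0), ('a cat', 1.3), ('on a sofa', 1.0)]
```

The fragments would then be encoded and their embeddings scaled before being passed to the pipeline via `prompt_embeds` — that second half is where most of the real integration work lies.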
The SD world is evolving quickly, led by Stability AI, CompVis, and RunwayML, and followed by a vast community, including HuggingFace. If I went this route, then whenever a new feature emerges I would have to wait for diffusers to catch up or implement it myself, which is time- and energy-consuming. So I’m not sure whether this is a good approach for me.
To my knowledge, optimizations for SD include xformers, attention slicing, and PyTorch 2.0’s SDPA (scaled dot-product attention). (Correct me if I’m mistaken. I’d also like to know whether TensorRT or NVIDIA’s FasterTransformer could push it further.) These are all available in both diffusers and webui, so I’m not sure what causes the performance gap.
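For what it’s worth, these optimizations all compute the same attention math and differ only in memory layout and kernel fusion, which is why they shouldn’t change outputs. Here is a self-contained sketch in plain PyTorch (not diffusers’ actual implementation) showing that attention slicing and PyTorch 2.0’s fused SDPA are numerically equivalent to naive attention:

```python
import torch
import torch.nn.functional as F

def naive_attention(q, k, v):
    """Reference attention: materializes the full (seq, seq) score matrix."""
    scale = q.shape[-1] ** -0.5
    scores = (q @ k.transpose(-2, -1)) * scale
    return torch.softmax(scores, dim=-1) @ v

def sliced_attention(q, k, v, slice_size=32):
    """Attention slicing: process queries in chunks to cap peak memory.
    Same math as naive_attention, just computed piecewise."""
    outs = []
    for i in range(0, q.shape[-2], slice_size):
        outs.append(naive_attention(q[..., i:i + slice_size, :], k, v))
    return torch.cat(outs, dim=-2)

torch.manual_seed(0)
q, k, v = (torch.randn(2, 8, 128, 64) for _ in range(3))  # (batch, heads, seq, dim)

ref = naive_attention(q, k, v)
sliced = sliced_attention(q, k, v)
sdpa = F.scaled_dot_product_attention(q, k, v)  # fused kernel in PyTorch >= 2.0

print(torch.allclose(ref, sliced, atol=1e-4), torch.allclose(ref, sdpa, atol=1e-4))
```

In diffusers you don’t write this yourself: `pipe.enable_attention_slicing()` and `pipe.enable_xformers_memory_efficient_attention()` toggle these, and recent diffusers versions use PyTorch 2.0 SDPA by default when it’s available.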
The same problem confused me. Have you found any good solution?
Our online services currently still use WebUI, but we’re planning to make Diffusers the default.
In my own experience, developing with Diffusers is way easier than with WebUI, since Diffusers provides clearer programming APIs. So if you’re developing API services, rather than a demo for non-developers to play with or to show off your research, I would recommend Diffusers.
However, IMO, due to the fast development of the diffusion area, more and more functionality will be integrated into the codebase, which challenges the code-design capabilities of the developers. It’s very likely that Diffusers will become more and more complicated. Currently the different pipelines don’t share common code, which could be a big maintenance problem. But so far, I still choose to work more with Diffusers than with WebUI.
As for the speed of functionality integration, I think there’s no need to worry about Diffusers. Since it’s easier to develop with, the community quickly makes things happen on it.
Furthermore, if you need to optimize the model, e.g., with TensorRT, it’s also easier to implement with Diffusers pipelines.
I agree with your idea. But diffusers produces slightly different images compared to webui’s, and it may take a lot of time and energy to close that gap. I think a git repo would be a good choice for anyone who wants a webui built on diffusers.