I like playing with image/video generation. I used to do it with ComfyUI, an amazing tool that's reasonably easy to use. I've been digging into ways to run Flux faster than ComfyUI on my Mac M1 with 8 GB of RAM. 8 GB isn't a lot, but it's doable, so I'm trying to save as much memory as possible, and running a whole backend plus an advanced ComfyUI frontend didn't leave much RAM available for the diffusion process. This is my code:
It works and generates images at ~100 s/it (instead of 300-400 s/it on ComfyUI), but I would like to speed it up a bit by using a quantized T5 XXL, just as I did in ComfyUI (and I'd like to be able to use the same files as ComfyUI, as I have done with the UNET).
Since T5EncoderModel is part of Transformers rather than Diffusers, it should be fine to load and use it as a Transformers model, though there may still be some bugs in the GGUF support.
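As a sketch of that route: recent Transformers versions accept a `gguf_file` argument in `from_pretrained`, which dequantizes GGUF weights into a regular model. The repo and file names below are placeholders — point them at the GGUF encoder you already use with ComfyUI.

```python
import torch
from transformers import T5EncoderModel

# Hypothetical repo/file names -- substitute the GGUF T5 XXL encoder
# checkpoint you already have on disk for ComfyUI.
text_encoder = T5EncoderModel.from_pretrained(
    "city96/t5-v1_1-xxl-encoder-gguf",        # assumption: a GGUF encoder repo
    gguf_file="t5-v1_1-xxl-encoder-Q4_K_M.gguf",  # assumption: file name
    torch_dtype=torch.float16,
)
```

Note that Transformers dequantizes GGUF tensors on load, so this mainly saves disk/download size; peak RAM during inference is closer to the fp16 model.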
If you don’t insist on using the same file, you could load a different file that uses a different quantization method…
One possibility would be to first dequantize it and then quantize it on the fly in a different format?
How about torchao, bitsandbytes, or optimum-quanto?
Diffusers+Transformers, ComfyUI, and A1111 WebUI are all completely different programs. Although their purposes and results are largely the same, their implementations differ. Compatibility is provided for convenience, but it is better to convert the files in advance to avoid problems.