SDXL custom pipeline - Input to unet? - Why 2 text encoders?

ingo-m · August 14, 2023, 11:39am

@asrielh no, unfortunately I haven’t made any progress on this. Conceptually, my understanding of stable-diffusion-v1-4 is that the components are connected like this: tokenizer → text_encoder → unet → vae. I can’t make sense of the two text encoders and two tokenizers in SDXL.
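
For what it's worth, here is a shape-only sketch of how the two text encoders appear to feed the unet in the diffusers SDXL pipeline. It assumes the per-token hidden states of both encoders are concatenated along the feature axis (768 + 1280 = 2048) and that the pooled output of text_encoder_2 is passed separately, together with six size/crop numbers. No models are loaded; the zero-filled lists and the helper names `fake_hidden_states` and `concat_features` are made up for illustration.

```python
# Shape-level sketch of SDXL text conditioning. Assumption: this mirrors what
# the diffusers SDXL pipeline does; zero-filled lists stand in for real outputs.

SEQ_LEN = 77          # both CLIP tokenizers pad/truncate to 77 tokens
DIM_ENCODER_1 = 768   # hidden size of text_encoder (CLIP ViT-L)
DIM_ENCODER_2 = 1280  # hidden size of text_encoder_2 (OpenCLIP ViT-bigG)


def fake_hidden_states(seq_len, dim):
    """Hypothetical stand-in for one encoder's per-token hidden states."""
    return [[0.0] * dim for _ in range(seq_len)]


def concat_features(a, b):
    """Concatenate two token sequences along the feature (last) axis."""
    return [row_a + row_b for row_a, row_b in zip(a, b)]


# tokenizer -> text_encoder and tokenizer_2 -> text_encoder_2 run on the same prompt
hidden_1 = fake_hidden_states(SEQ_LEN, DIM_ENCODER_1)
hidden_2 = fake_hidden_states(SEQ_LEN, DIM_ENCODER_2)
pooled_2 = [0.0] * DIM_ENCODER_2  # pooled embedding, from text_encoder_2 only

# One conditioning sequence for cross-attention: 77 tokens x 2048 features
prompt_embeds = concat_features(hidden_1, hidden_2)

# Extra conditioning: original size, crop top-left, target size (example values)
time_ids = [1024, 1024, 0, 0, 1024, 1024]

# Keyword names as used when the pipeline calls the unet
unet_inputs = {
    "encoder_hidden_states": prompt_embeds,
    "added_cond_kwargs": {"text_embeds": pooled_2, "time_ids": time_ids},
}

print(len(prompt_embeds), len(prompt_embeds[0]))
```

So under these assumptions the flow is still tokenizer → text_encoder → unet → vae, only with two tokenizer/encoder pairs whose outputs are merged before the unet sees them.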
