Unconditional Latent Diffusion using AutoencoderKL

Hello everyone,

I’m new to the field and currently working on a project that involves image generation. I have a dataset that I’ve already encoded into latent representations using a pre-trained AutoencoderKL.
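
For context, here's roughly how I produced the latents. The VAE checkpoint name and the 256-pixel resolution are placeholders for my actual setup, so please read this as a sketch rather than my exact code:

```python
import torch
from PIL import Image
from diffusers import AutoencoderKL
from torchvision import transforms

# Placeholder checkpoint; I used a pre-trained AutoencoderKL like this one
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse").to("cuda")
vae.eval()

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(256),
    transforms.ToTensor(),
    transforms.Normalize([0.5], [0.5]),  # scale pixels to [-1, 1]
])

@torch.no_grad()
def encode_image(path: str) -> torch.Tensor:
    image = preprocess(Image.open(path).convert("RGB")).unsqueeze(0).to("cuda")
    posterior = vae.encode(image).latent_dist
    # sample from the posterior and apply the VAE's scaling factor
    return posterior.sample() * vae.config.scaling_factor
```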

Now, I want to train a UNet model on this encoded dataset. I came across this example script (https://github.com/huggingface/diffusers/blob/main/examples/unconditional_image_generation/train_unconditional.py) for training an unconditional image generation model, and it seems quite relevant to my task.

However, I’m a bit confused about how to integrate my encoded dataset into this workflow.

Here are my specific questions:

  1. How should I modify the data loader in the example code to load my pre-encoded latents instead of raw images? (I've sketched my current attempt below this list.)
  2. Are there any additional pre-processing steps I should consider before feeding the encoded dataset into the UNet model?
  3. Is it advisable to fine-tune the UNet model using this encoded dataset? If so, what are some best practices?
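
To make my questions more concrete, here's roughly the modification I have in mind. The file name `latents.pt`, the latent shape `(N, 4, 32, 32)`, and the hyperparameters are all placeholders; I'm not sure this is the right approach, which is partly what I'm asking:

```python
import torch
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader
from diffusers import UNet2DModel, DDPMScheduler

# Placeholder: latents saved earlier as one tensor of shape (N, 4, 32, 32)
class LatentDataset(Dataset):
    def __init__(self, path="latents.pt"):
        self.latents = torch.load(path)

    def __len__(self):
        return len(self.latents)

    def __getitem__(self, idx):
        return self.latents[idx]

loader = DataLoader(LatentDataset(), batch_size=16, shuffle=True)

# in_channels/sample_size match the latent shape, not the pixel images
model = UNet2DModel(sample_size=32, in_channels=4, out_channels=4).to("cuda")
noise_scheduler = DDPMScheduler(num_train_timesteps=1000)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

for latents in loader:
    latents = latents.to("cuda")
    noise = torch.randn_like(latents)
    timesteps = torch.randint(
        0, noise_scheduler.config.num_train_timesteps,
        (latents.shape[0],), device=latents.device,
    )
    # add noise to the latents instead of to pixel images
    noisy_latents = noise_scheduler.add_noise(latents, noise, timesteps)
    noise_pred = model(noisy_latents, timesteps).sample
    loss = F.mse_loss(noise_pred, noise)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```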

If there’s any additional information that would help you assist me, please let me know.

Thank you for taking the time to read my post. I appreciate any help or guidance you can offer!

Best regards,
Z