To better understand what text-to-image models can do, I’d like to take an image, obtain its latent space representation using a model that supports this, and then reconstruct a new image from that latent. I’m curious how similar the result is to the original; I think this would give me a feel for what kinds of images the model can and cannot represent.
I’m very new to all this, and textual_inversion.py, where this information may be hidden, is way too complex for me to understand so far.
Does somebody have an example for this?
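
For reference, here is roughly what I have in mind, as a minimal sketch using the `AutoencoderKL` class from diffusers (the VAE that Stable Diffusion uses to map between pixels and latents). The model id `stabilityai/sd-vae-ft-mse` and the file name `input.png` are just examples; I may be misunderstanding the API:

```python
# Sketch: encode an image to SD latent space and decode it back,
# so the reconstruction can be compared with the original.
import torch
import numpy as np
from PIL import Image
from diffusers import AutoencoderKL

# Load a Stable Diffusion VAE (example model id; any compatible VAE should work).
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")
vae.eval()

# Preprocess: resize to a multiple of 8 and scale pixel values to [-1, 1].
image = Image.open("input.png").convert("RGB").resize((512, 512))  # hypothetical file
x = torch.from_numpy(np.array(image)).float() / 127.5 - 1.0
x = x.permute(2, 0, 1).unsqueeze(0)  # (1, 3, 512, 512)

with torch.no_grad():
    # Encode: the VAE returns a distribution over latents; sample from it
    # (or use .mode() for a deterministic encoding).
    latents = vae.encode(x).latent_dist.sample()  # (1, 4, 64, 64)
    # Decode the latents straight back to pixel space.
    decoded = vae.decode(latents).sample

# Postprocess back to an image and save it for visual comparison.
decoded = (decoded / 2 + 0.5).clamp(0, 1)
out = (decoded[0].permute(1, 2, 0).numpy() * 255).round().astype("uint8")
Image.fromarray(out).save("reconstruction.png")
```

If I understand correctly, this only round-trips the image through the VAE, so it would show how much the latent compression loses, not what the diffusion model itself can generate. Is that the right way to think about it?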