Textual inversion for one image only

To better understand what text-to-image models can do, I'd like to take a model that supports this, obtain the latent-space representation of a single image, and then generate a new image from that representation.

I'm curious how similar the result would be to the original. I think this would give me a sense of what kinds of images a model can create and what kinds it can't.
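To make concrete what I mean, here is a toy sketch of the kind of optimization I imagine: a frozen linear "decoder" standing in for the real (frozen) diffusion model, and gradient descent on a latent vector so that the decoded output matches one target image. All the names and dimensions here are my own invention, not anything from the actual training script:

```python
import numpy as np

def invert_image(W, target, steps=2000, lr=0.05):
    """Optimize a latent vector z so that W @ z approximates `target`.

    W is a frozen stand-in "decoder" (the real model would be the frozen
    diffusion model); only z is updated, which is the core idea of
    textual inversion as I understand it.
    """
    z = np.zeros(W.shape[1])
    for _ in range(steps):
        residual = W @ z - target
        z -= lr * 2.0 * W.T @ residual / target.size  # gradient of MSE
    return z

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 8))      # frozen map: 8-dim latent -> 64 "pixels"
target = rng.normal(size=64)      # the single image I want to invert
z = invert_image(W, target)

err_before = np.mean(target ** 2)             # error of the zero latent
err_after = np.mean((W @ z - target) ** 2)    # error after inversion
```

Is the real procedure roughly this, just with the diffusion loss instead of a plain MSE?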

I'm very new to all this, and the textual_inversion.py script, where this logic may be hidden, is way too complex for me to understand so far.

Does somebody have an example of this?