Image-to-image SDXL fine-tuning using Python

Hi guys,

Don’t know which category I should pick, but… I recently ran into an issue. I want to write a script to fine-tune SDXL on my local machine using a GPU. I want to train the model on a set of images, with and without masks. Basically the task is: given an image, generate polygons (or whatever I want to place there) within a certain region. I already have this paired dataset.
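One common way to express "only the masked region should change" during fine-tuning is to restrict the diffusion loss to the mask. This is just a hedged NumPy sketch of that idea — the function name, shapes, and epsilon are my own illustration, not from any particular library:

```python
import numpy as np

def masked_diffusion_loss(pred_noise, target_noise, mask):
    """MSE between predicted and target noise, restricted to the masked region.

    pred_noise, target_noise: (B, C, H, W) arrays.
    mask: (B, 1, H, W), 1 inside the region to regenerate, 0 elsewhere.
    """
    sq_err = (pred_noise - target_noise) ** 2
    # Broadcast the single-channel mask across channels, then average over
    # masked elements only (epsilon guards against an all-zero mask).
    weighted = sq_err * mask
    return weighted.sum() / (mask.sum() * pred_noise.shape[1] + 1e-8)

rng = np.random.default_rng(0)
pred = rng.normal(size=(1, 4, 8, 8))
target = rng.normal(size=(1, 4, 8, 8))
mask = np.zeros((1, 1, 8, 8))
mask[..., 2:6, 2:6] = 1.0  # square region to repaint
loss = masked_diffusion_loss(pred, target, mask)
```

Because pixels outside the mask contribute nothing to the loss, the model is never penalized for leaving the rest of the image alone.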

The result should look like this: in inference.py I pass an image and a prompt, and the model modifies the image.

I searched the internet for an example of this type of generation, but unfortunately all I found were low-code or no-code solutions, or solutions that have become outdated due to the many changes introduced to the Hugging Face Hub.

I tried several approaches, but I keep running into strange errors that I’m not always able to resolve.

If you have a code snippet that could help me fine-tune the model, save it, and then run inference, please let me know.

:pray::pray::pray: Thank you in advance!


Is there anything similar to inpainting or an existing ControlNet that would work?

SDXL is not a model architecture specialized for image processing, so I don’t think there are many options.
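If the off-the-shelf inpainting route fits your use case, the inputs are just an RGB image plus a single-channel mask where white marks the region to repaint. A minimal sketch of preparing those inputs — the pipeline call at the bottom is the diffusers SDXL inpainting checkpoint from the Hub, commented out because it needs a large download and a GPU:

```python
from PIL import Image, ImageDraw

def make_region_mask(size, box):
    """White-on-black mask: white = region the inpainting pipeline repaints."""
    mask = Image.new("L", size, 0)
    ImageDraw.Draw(mask).rectangle(box, fill=255)
    return mask

image = Image.new("RGB", (1024, 1024), (128, 128, 128))  # placeholder input
mask = make_region_mask(image.size, (256, 256, 768, 768))

# With diffusers installed, an SDXL inpainting checkpoint consumes these directly:
# from diffusers import AutoPipelineForInpainting
# pipe = AutoPipelineForInpainting.from_pretrained(
#     "diffusers/stable-diffusion-xl-1.0-inpainting-0.1"
# ).to("cuda")
# out = pipe(prompt="a polygon pattern", image=image, mask_image=mask).images[0]
```

In practice you would load your real image with `Image.open(...)` and build the mask from your dataset's mask annotations instead of a hard-coded rectangle.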

Is it possible to fine-tune Kandinsky for certain image generation?


Fine-tuning seems possible, but the sample code is old, so it’s unclear whether it will still work as-is. Generally, in this type of model, text-to-image training results also carry over directly to image-to-image.
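The reason text-to-image training carries over is that image-to-image (SDEdit-style) reuses the same denoiser: instead of starting from pure noise, it starts from the input image noised to an intermediate timestep chosen by the `strength` parameter, then denoises as usual. A toy NumPy sketch of just that initialization step — the schedule values are illustrative DDPM-style defaults, not pulled from SDXL or Kandinsky:

```python
import numpy as np

def img2img_init(x0, strength, num_steps=1000, rng=None):
    """Noise an input to the timestep implied by `strength` (SDEdit-style).

    strength=1.0 behaves like plain text-to-image (start from ~pure noise);
    strength near 0.0 keeps the input almost unchanged.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    # Linear beta schedule as in DDPM; alpha_bar is the cumulative product.
    betas = np.linspace(1e-4, 0.02, num_steps)
    alpha_bar = np.cumprod(1.0 - betas)
    t = min(int(strength * num_steps), num_steps - 1)
    noise = rng.normal(size=x0.shape)
    # Standard forward-diffusion closed form at step t.
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
```

So any fine-tuning that improves the denoiser for text-to-image automatically improves what image-to-image denoises toward.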