LoRA + models = scene with person: Making anime action scenes with family faces

Goal: AI-generated images of my family in artistic (anime/drawn-style) action scenes.

Technical: I have LoRAs of my family's faces and, of course, access to any model I want from Hugging Face. Using these, I've made two attempts at generating images:

  1. Use models biased toward drawings to get action images (good, depending on prompting) + YOLO11 face detection (the masks are OK about 60% of the time) + inpainting with the relevant LoRA and the family member's name at various weights. It's this last step that fails horribly: I can't get a recognizable face inpainted into the images (self.inpaint_pipe(prompt, negative_prompt, image, mask_image, …, guidance=7.0, strength=0.6)), despite some pretty heavy tuning of the parameters. Is this a reasonable path, or am I overestimating how much can be done here to get drawing-style images out of real-world photos/LoRAs without other steps? (A sketch of this inpainting step follows the list.)
  2. Attempt two, separate from attempt 1: just combine the LoRA with the model and ask for the whole scene. The result varies heavily by model, but generally I get either high-quality boring scenes with recognizable faces (photorealistic) or low-quality, non-recognizable faces in anime action scenes. The code is no different from generic ComfyUI wiring: model → LoRA + empty latent → KSampler → VAE decode → save. (A diffusers equivalent is sketched below as well.)
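For concreteness, here's a minimal sketch of what attempt 1's inpainting step looks like in diffusers. The checkpoint, LoRA path, trigger word, and file names below are placeholders, not the actual setup; note also that the diffusers kwarg is guidance_scale, not guidance.

```python
# Minimal sketch of attempt 1's inpainting step in diffusers.
# The checkpoint, LoRA path, trigger word "mydaughter", and file
# names are placeholders, not the actual setup.
import torch
from PIL import Image
from diffusers import StableDiffusionXLInpaintPipeline

pipe = StableDiffusionXLInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",  # swap for a drawing-biased checkpoint
    torch_dtype=torch.float16,
).to("cuda")

# Load the family member's face LoRA and bake it in at a chosen weight.
pipe.load_lora_weights("loras/daughter.safetensors")  # placeholder path
pipe.fuse_lora(lora_scale=0.9)

image = Image.open("action_scene.png").convert("RGB")  # generated scene (placeholder)
mask_image = Image.open("face_mask.png").convert("L")  # YOLO11 face mask, white = repaint

result = pipe(
    prompt="mydaughter, anime style, detailed face",  # LoRA trigger word + style cues
    negative_prompt="photo, realistic, blurry, deformed",
    image=image,
    mask_image=mask_image,
    guidance_scale=7.0,  # diffusers calls it guidance_scale, not guidance
    strength=0.6,        # fraction of denoising applied to the masked region
).images[0]
result.save("inpainted.png")
```

Here, strength controls how much of the masked region actually gets repainted, so it's usually the first knob to push when the LoRA identity doesn't come through.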
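And attempt 2's whole-scene generation, as a rough diffusers equivalent of the ComfyUI graph above (same placeholder checkpoint, LoRA path, and trigger word):

```python
# Minimal sketch of attempt 2: the ComfyUI graph
# (model -> LoRA + empty latent -> KSampler -> VAE decode -> save)
# expressed in diffusers. Checkpoint, LoRA, and trigger word are placeholders.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

pipe.load_lora_weights("loras/daughter.safetensors")  # placeholder path

image = pipe(
    prompt="mydaughter leaping across rooftops, dynamic anime action scene",
    negative_prompt="photo, realistic, low quality",
    num_inference_steps=30,
    guidance_scale=7.0,
    cross_attention_kwargs={"scale": 0.8},  # LoRA weight: trades identity vs. anime style
).images[0]
image.save("scene.png")
```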

How should I approach this problem? I'm looking for a DIY approach rather than a service, and the social-media forums all seem to be hawking services.


It’s quite difficult to accomplish this with LoRA alone, so in the case of SDXL I recommend combining it with IP-Adapter or ControlNet, or using OSS models with newer architectures.
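For reference, a minimal sketch of the IP-Adapter route in diffusers. The h94/IP-Adapter SDXL weights are the publicly released ones; the base checkpoint and the face-crop path are placeholders:

```python
# Minimal sketch: SDXL + IP-Adapter, conditioning on a reference face crop.
# The base checkpoint and face.png are placeholders; the h94/IP-Adapter
# weights are the publicly available release.
import torch
from PIL import Image
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

pipe.load_ip_adapter(
    "h94/IP-Adapter",
    subfolder="sdxl_models",
    weight_name="ip-adapter_sdxl.bin",
)
pipe.set_ip_adapter_scale(0.6)  # higher = closer to the reference face, less prompt freedom

face = Image.open("face.png")  # placeholder: a clean crop of the family member's face

image = pipe(
    prompt="anime action scene, dynamic pose, detailed line art",
    negative_prompt="photo, realistic, low quality",
    ip_adapter_image=face,
    guidance_scale=7.0,
).images[0]
image.save("ip_adapter_scene.png")
```

The IP-Adapter scale plays the same role as the LoRA weight above: higher values preserve the reference face better but constrain the composition more.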

@John6666 Thank you for such a detailed, well-linked write-up. Time presses on us all, but I am working through all of it and taking another pass at this work. That write-up must have been quite the time investment.
