I am currently working on training a Stable Diffusion model for image generation, and I have encountered a few challenges that I believe your expertise can help me overcome.
The model is being trained on a custom dataset. The images are grouped into multiple categories, and each category contains only a single image. The images have no added noise, and I have provided a caption for each one. I am training with DreamBooth.
One of the initial obstacles I encountered was GPU memory exhaustion due to the substantial dataset size. As a workaround, I decided to train the model on a smaller subset of the dataset.
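To make the workaround concrete, here is a minimal sketch of how a subset could be drawn reproducibly. It is an illustration only, not my actual training script: the `records` layout, the file paths, the captions, and the `sample_subset` helper are all hypothetical.

```python
import random


def sample_subset(records, k, seed=0):
    """Pick up to k categories reproducibly from a category -> (path, caption) map."""
    rng = random.Random(seed)          # fixed seed so the same subset is drawn each run
    categories = sorted(records)       # sort first so the draw does not depend on dict order
    chosen = rng.sample(categories, min(k, len(categories)))
    return {name: records[name] for name in sorted(chosen)}


# Hypothetical records: one image and one caption per category.
records = {
    "sofa": ("images/sofa.png", "a photo of a sofa"),
    "lamp": ("images/lamp.png", "a photo of a lamp"),
    "table": ("images/table.png", "a photo of a table"),
    "chair": ("images/chair.png", "a photo of a chair"),
}

subset = sample_subset(records, k=2, seed=42)
for name, (path, caption) in subset.items():
    print(name, path, caption)
```

Fixing the seed keeps the subset stable between runs, so results from different hyperparameter settings stay comparable.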
Challenges and Issues:
- Generating images from the model works relatively well for a single category of objects, but problems arise when attempting to combine objects from two different categories. The combined results are not satisfactory, and the generated images are often cropped, showing only a fraction of the intended scene instead of the complete room.
- I am seeking guidance on improving the results when combining objects from different trained categories in a single generated image. The current outcomes do not meet my expectations.
- I am struggling to generate complete room images. The generated images consistently show only a portion of the desired scene.
- My primary objective is to enhance the quality of generated images when working with a single object category. I aim to produce images that closely resemble the objects on which the model was originally trained.
I would greatly appreciate your insights, recommendations, and strategies related to:
- Any necessary adjustments to hyperparameters to enhance the model’s performance.
- Augmentation techniques or data preprocessing methods that could potentially improve results.
- Suggestions for modifying the training process to better accommodate single and combined object categories.
- Insights or techniques for generating complete room images successfully.
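On the cropping problem, one preprocessing variant that may be worth testing is padding each training image to a square instead of center-cropping it, so that the whole room stays in frame. Whether cropping during preprocessing is actually the cause here is an assumption, not something I have verified. The sketch below only computes the geometry; the `letterbox_box` helper and the 512-pixel target resolution are assumptions for illustration.

```python
def letterbox_box(width, height, target=512):
    """Return (new_w, new_h, left, top) to fit an image inside a
    target x target canvas without cropping any of it."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    # Scale so the longer side exactly matches the target resolution.
    new_w = max(1, round(width * target / longest))
    new_h = max(1, round(height * target / longest))
    # Center the resized image; the remaining border is padding.
    left = (target - new_w) // 2
    top = (target - new_h) // 2
    return new_w, new_h, left, top


print(letterbox_box(1920, 1080))  # (512, 288, 0, 112)
```

With an image library such as Pillow, this would correspond to resizing the image to `(new_w, new_h)` and pasting it onto a blank square canvas at `(left, top)`.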
Thank you for your contribution.