I’m working on a personal project where I want to generate custom emoji-style images from text prompts, e.g. turning the prompt “flying pig with wings” into a matching emoji image.
I’m using black-forest-labs/FLUX.1-dev as the base model. It’s a rectified-flow transformer in the same family as Stable Diffusion, though note it is large (~12B parameters), so training it is fairly VRAM-hungry.
What I have:
~25k 512x512 emoji-style images
Captions for each (in .txt files)
A train.json mapping image to caption
```
dataset/
├── images/      # image_001.png, ...
├── captions/    # caption_001.txt, ...
└── train.json   # [{ "image": "images/image_001.png", "caption": "captions/caption_001.txt" }, ...]
```
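Since `train.json` maps each image to a caption *file* rather than to the caption text itself, the dataloader has to resolve those `.txt` files to strings. A minimal sketch of that pairing step using only the standard library (the function name `load_pairs` is mine, not from any particular training script):

```python
import json
from pathlib import Path

def load_pairs(root):
    """Read train.json under `root` and return (image_path, caption_text) pairs.

    Each JSON entry points at an image file and a caption .txt file,
    so the caption is read from disk rather than taken from the JSON.
    """
    root = Path(root)
    entries = json.loads((root / "train.json").read_text(encoding="utf-8"))
    pairs = []
    for entry in entries:
        caption = (root / entry["caption"]).read_text(encoding="utf-8").strip()
        pairs.append((root / entry["image"], caption))
    return pairs
```

A torch `Dataset` wrapping this would then open the image and tokenize the caption in `__getitem__`.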
What I need help with:
1. How many images is “enough”? Is 25k too much, or just fine?
2. Any working training script for FLUX.1? I tried one (PyTorch + diffusers), but the outputs look like noise.
3. Best training config? Should I freeze the VAE/text encoder? Recommended batch size, learning rate, etc.?
4. How do I export the model to ONNX or TFLite? I’m planning to use it in a Flutter app later.
A sample setup, training script, or any beginner-friendly advice would be appreciated.
1. 25k images is plenty. As a rule of thumb, more data is better.
2. These training scripts are well known; searching will turn up a huge amount of existing know-how, so I recommend starting there. Two popular options are bmaltais/kohya_ss (on GitHub) and OneTrainer, a one-stop solution for Stable Diffusion-style training.
3. The model you are trying to create is a bit more specialized than, say, imitating someone’s face, so it may be better to find a similar use case (recreating a painting style, or a comparable stylized deformation) and borrow its parameters as a starting point.
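If you would rather stay in the diffusers ecosystem than use kohya_ss or OneTrainer, its examples folder ships a FLUX LoRA script (`examples/dreambooth/train_dreambooth_lora_flux.py`). A hedged launch sketch follows; the hyperparameters (rank 16, LR 1e-4, batch size 1 with gradient accumulation) are common starting points for style LoRAs, not tested values for this dataset, and with LoRA the VAE and text encoders stay frozen by default. Per-image captions need extra dataset options beyond the single `--instance_prompt` shown here.

```shell
# Sketch only: assumes diffusers' example script and a configured `accelerate`.
accelerate launch train_dreambooth_lora_flux.py \
  --pretrained_model_name_or_path="black-forest-labs/FLUX.1-dev" \
  --instance_data_dir="dataset/images" \
  --instance_prompt="an emoji-style icon" \
  --output_dir="flux-emoji-lora" \
  --mixed_precision="bf16" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --rank=16 \
  --learning_rate=1e-4 \
  --lr_scheduler="constant" \
  --max_train_steps=5000
```

If outputs look like noise, the usual suspects are a missing `bf16` setting, a too-high learning rate, or a VAE scaling mismatch in a hand-rolled script.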
4. I’ve never seen anyone export FLUX, but if it’s possible, I think this is how you would do it…
# ONNX Runtime
🤗 [Optimum](https://github.com/huggingface/optimum) provides a Stable Diffusion pipeline compatible with ONNX Runtime. You'll need to install 🤗 Optimum with the following command for ONNX Runtime support:
```bash
pip install -q optimum["onnxruntime"]
```
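For reference, with a model Optimum does support (Stable Diffusion; FLUX export is, as noted, unverified), the documented export path is the `optimum-cli export onnx` command. Whether an equivalent route exists for FLUX is an open question:

```shell
# Export a supported Stable Diffusion checkpoint to ONNX;
# swap in a FLUX checkpoint only if/when Optimum adds support for it.
optimum-cli export onnx --model runwayml/stable-diffusion-v1-5 sd_onnx/
```

The resulting `.onnx` files can then be loaded with ONNX Runtime, which also ships mobile builds usable from a Flutter app.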
sheep8 (April 15, 2025, 10:07am):
It can also be optimized and trained through a proxy IP.