Hello, I am using Dreambooth with the runwayml/stable-diffusion-v1-5 model, and after running the training script I am getting the following folders in the output directory:
- logs
- safety_checker
- scheduler
- text_encoder
- tokenizer
- unet
- vae
Nothing else is being generated, so when I try to run inference I get the following error:
OSError: Error no file named model_index.json found in directory
The most interesting thing is that this pipeline was working perfectly before I upgraded the diffusers library, so I assume something has changed and the training-to-inference flow no longer works. I would really appreciate any help with this issue. Thanks.
Hello @SrLozano!
I’m not able to reproduce this problem; a `model_index.json` file is being correctly generated for me.
Could you please post the exact command-line invocation you are using, as well as the output from `diffusers-cli env`?
Pedro, thank you for taking the time to respond. I appreciate it.
The exact command line invocation I am using is the following:
export MODEL_NAME="runwayml/stable-diffusion-v1-5"
export INSTANCE_DIR="/zhome/d1/6/diffusers/examples/dreambooth/images"
export OUTPUT_DIR="/zhome/d1/6/diffusers/examples/dreambooth/saved_model"
accelerate launch train_dreambooth.py \
--pretrained_model_name_or_path=$MODEL_NAME \
--instance_data_dir=$INSTANCE_DIR \
--output_dir=$OUTPUT_DIR \
--instance_prompt="a photo of sks dog" \
--resolution=512 \
--train_batch_size=1 \
--gradient_accumulation_steps=1 \
--learning_rate=5e-6 \
--lr_scheduler="constant" \
--lr_warmup_steps=0 \
--max_train_steps=400
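As a quick sanity check after training, I list what actually landed in the output directory (plain standard library, just for illustration):

import os

output_dir = "/zhome/d1/6/diffusers/examples/dreambooth/saved_model"

# A completed run should contain model_index.json next to the component
# subfolders (unet/, vae/, text_encoder/, tokenizer/, scheduler/, ...);
# in my case the JSON file is missing.
for name in sorted(os.listdir(output_dir)):
    print(name)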
On the other hand, the output from `diffusers-cli env` is:
- `diffusers` version: 0.17.0.dev0
- Platform: Linux-3.10.0-1160.88.1.el7.x86_64-x86_64-with-glibc2.17
- Python version: 3.8.13
- PyTorch version (GPU?): 1.13.1+cu117 (False)
- Huggingface_hub version: 0.14.1
- Transformers version: 4.28.1
- Accelerate version: 0.18.0
- xFormers version: not installed
- Using GPU in script?: Yes, I am using an A100 from a computing cluster
- Using distributed or parallel set-up in script?: No
I noticed the error when I updated from version 0.13.1 to 0.17.0. I have a script that automatically runs inference to generate images from a model trained with Dreambooth, and everything was working smoothly. When the error appeared, I realised that Dreambooth is no longer outputting the same files it used to. I am using the recommended pipeline for inference:
from diffusers import DiffusionPipeline
import torch
model_id = "path_to_saved_model"
pipe = DiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")
prompt = "A photo of sks dog in a bucket"
image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]
image.save("dog-bucket.png")
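Just for illustration, this is the kind of guard one could add to the automated script so the failure is caught earlier with a clearer message (not a fix, only a sanity check):

import os

model_id = "path_to_saved_model"
index_path = os.path.join(model_id, "model_index.json")

# DiffusionPipeline.from_pretrained() reads model_index.json to know which
# pipeline class and components to assemble, so fail early if it is missing.
if not os.path.isfile(index_path):
    raise FileNotFoundError(
        f"No model_index.json in {model_id}; the Dreambooth run did not save a full pipeline."
    )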
Everything’s very similar to my setup, and in my case the `model_index.json` file is indeed saved. Could you maybe open an issue on GitHub and reference this post so we can get more eyes on it?
Sure! I’ll do it! Thank you very much for your time Pedro.
Having the same issue here.
- `diffusers` version: 0.17.0.dev0
- Platform: Linux-5.15.0-69-generic-x86_64-with-glibc2.27
- Python version: 3.10.9
- PyTorch version (GPU?): 2.0.1+cu117 (True)
- Huggingface_hub version: 0.14.1
- Transformers version: 4.29.2
- Accelerate version: 0.19.0
- xFormers version: not installed
- Using GPU in script?: 4090
- Using distributed or parallel set-up in script?: No
training `DreamBooth` using `Stable-Diffusion-v1-5`
The issue is apparently: Dreambooth not generating model_index.json and thus is not able to make inference · Issue #3468 · huggingface/diffusers · GitHub.
I read two more issues about the same problem, which unfortunately are just unstructured support requests with abridged solutions, and could extract the following theory:
The `model_id` passed to `DiffusionPipeline.from_pretrained(...)` is not the local directory, as you might think from the docs, but the identifier of the remote model, which needs to be pushed to the Hugging Face Hub after training and downloaded to be used, i.e. there's no local approach.
Maybe you can patch the files from the local cache `~/.cache/huggingface`, or maybe this is all wrong; I'm just compensating for the lack of comprehensive, clear docs with speculation, trial and error.
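For what it's worth, `DiffusionPipeline.from_pretrained(...)` does accept a plain local directory; the failing piece here is only the missing model_index.json. One speculative, untested workaround might be to rebuild a full pipeline from the trained subfolders plus the base model and save it again, which should write a fresh model_index.json (the paths below are just placeholders):

from diffusers import StableDiffusionPipeline, UNet2DConditionModel, AutoencoderKL
from transformers import CLIPTextModel

output_dir = "path_to_saved_model"          # the Dreambooth output directory
base_model = "runwayml/stable-diffusion-v1-5"

# Load the fine-tuned components from the training output...
unet = UNet2DConditionModel.from_pretrained(output_dir, subfolder="unet")
vae = AutoencoderKL.from_pretrained(output_dir, subfolder="vae")
text_encoder = CLIPTextModel.from_pretrained(output_dir, subfolder="text_encoder")

# ...and let the base model supply whatever else a full pipeline needs.
pipe = StableDiffusionPipeline.from_pretrained(
    base_model,
    unet=unet,
    vae=vae,
    text_encoder=text_encoder,
)

# save_pretrained() writes model_index.json alongside the component subfolders,
# after which DiffusionPipeline.from_pretrained(output_dir) should work.
pipe.save_pretrained(output_dir)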