Training Flux LoRA Failed

I followed the steps in this video:
youtube /watch?v=bAmsJGOkkkw

Got this error 3 times:
Training failed.
Command '['python', 'run.py', 'config/replicate.yml']' returned non-zero exit status 1.

Can someone please help?
LOG:

Starting prediction
The token has not been saved to the git credentials helper. Pass add_to_git_credential=True in this function directly or --add-to-git-credential if using via huggingface-cli if you want to set the git credential as well.
Token is valid (permission: fineGrained).
Your token has been saved to /root/.cache/huggingface/token
Login successful
Detected zip file
Extracted zip file
Running 1 job
The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling transformers.utils.move_cache().
0it [00:00, ?it/s]
0it [00:00, ?it/s]
/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/controlnet_aux/mediapipe_face/mediapipe_face_common.py:7: UserWarning: The module 'mediapipe' is not installed. The package will have limited functionality. Please install it using the command: pip install 'mediapipe'
warnings.warn(
/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_5m_224 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_5m_224. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_11m_224 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_11m_224. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_21m_224 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_21m_224. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_21m_384 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_21m_384. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_21m_512 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_21m_512. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
return register_model(fn_wrapper)
{
  "datasets": [
    {
      "cache_latents_to_disk": true,
      "caption_dropout_rate": 0.05,
      "caption_ext": "txt",
      "folder_path": "input_images",
      "resolution": [
        512,
        768,
        1024
      ],
      "shuffle_tokens": false
    }
  ],
  "device": "cuda:0",
  "model": {
    "is_flux": true,
    "name_or_path": "black-forest-labs/FLUX.1-dev",
    "quantize": true
  },
  "network": {
    "linear": 16,
    "linear_alpha": 16,
    "type": "lora"
  },
  "sample": {
    "guidance_scale": 4,
    "height": 1024,
    "neg": "",
    "prompts": [
      "a sign that says 'I LOVE PROMPTS!' in the style of [trigger]"
    ],
    "sample_every": 250,
    "sample_steps": 20,
    "sampler": "flowmatch",
    "seed": 42,
    "walk_seed": true,
    "width": 1024
  },
  "save": {
    "dtype": "float16",
    "max_step_saves_to_keep": 1,
    "save_every": 2011
  },
  "train": {
    "batch_size": 1,
    "content_or_style": "balanced",
    "dtype": "bf16",
    "ema_config": {
      "ema_decay": 0.99,
      "use_ema": true
    },
    "gradient_accumulation_steps": 1,
    "gradient_checkpointing": true,
    "lr": 0.0004,
    "noise_scheduler": "flowmatch",
    "optimizer": "adamw8bit",
    "steps": 2010,
    "train_text_encoder": false,
    "train_unet": true
  },
  "training_folder": "output",
  "trigger_word": "TOK",
  "type": "sd_trainer"
}
Using EMA
/src/extensions_built_in/sd_trainer/SDTrainer.py:61: FutureWarning: torch.cuda.amp.GradScaler(args...) is deprecated. Please use torch.amp.GradScaler('cuda', args...) instead.
self.scaler = torch.cuda.amp.GradScaler()
#############################################

Running job: flux_train_replicate

#############################################
Running 1 process
Loading Flux model
Loading transformer
Error running job: black-forest-labs/FLUX.1-dev does not appear to have a file named config.json.

Result:

  • 0 completed jobs
  • 1 failure
========================================
Traceback (most recent call last):
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status
    response.raise_for_status()
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/transformer/config.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1751, in _get_metadata_or_catch_error
    metadata = get_hf_file_metadata(
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1673, in get_hf_file_metadata
    r = _request_wrapper(
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 376, in _request_wrapper
    response = _request_wrapper(
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 400, in _request_wrapper
    hf_raise_for_status(response)
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 367, in hf_raise_for_status
    raise HfHubHTTPError(message, response=response) from e
huggingface_hub.utils._errors.HfHubHTTPError: (Request ID: Root=1-66f67aeb-2e5d11046011068a4df3d290;aa99d9fa-3cee-45ea-a1e0-3eeb3edb7e68)
403 Forbidden: Please enable access to public gated repositories in your fine-grained token settings to view this repository…
Cannot access content at: https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/transformer/config.json.
If you are trying to create or update content, make sure you have a token with the write role.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/diffusers/configuration_utils.py", line 379, in load_config
    config_file = hf_hub_download(
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/utils/_deprecation.py", line 101, in inner_f
    return f(*args, **kwargs)
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1240, in hf_hub_download
    return _hf_hub_download_to_cache_dir(
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1347, in _hf_hub_download_to_cache_dir
    _raise_on_head_call_error(head_call_error, force_download, local_files_only)
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1857, in _raise_on_head_call_error
    raise LocalEntryNotFoundError(
huggingface_hub.utils._errors.LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/src/run.py", line 90, in <module>
    main()
  File "/src/run.py", line 86, in main
    raise e
  File "/src/run.py", line 78, in main
    job.run()
  File "/src/jobs/ExtensionJob.py", line 22, in run
    process.run()
  File "/src/jobs/process/BaseSDTrainProcess.py", line 1230, in run
    self.sd.load_model()
  File "/src/toolkit/stable_diffusion_model.py", line 469, in load_model
    transformer = FluxTransformer2DModel.from_pretrained(
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/diffusers/models/modeling_utils.py", line 612, in from_pretrained
    config, unused_kwargs, commit_hash = cls.load_config(
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/diffusers/configuration_utils.py", line 406, in load_config
    raise EnvironmentError(
OSError: black-forest-labs/FLUX.1-dev does not appear to have a file named config.json.

Traceback (most recent call last):
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/cog/server/worker.py", line 349, in _predict
    result = predict(**payload)
  File "/src/train.py", line 74, in train
    subprocess.check_call(["python", "run.py", "config/replicate.yml"], close_fds=False)
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['python', 'run.py', 'config/replicate.yml']' returned non-zero exit status 1.

How did you fix this?

Same here. I'm an absolute beginner too, so any help is gratefully accepted :wink:

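FLUX.1-dev is a gated model, so you'll have to request access on its model page (https://huggingface.co/black-forest-labs/FLUX.1-dev) before you can use it. The final "does not appear to have a file named config.json" message is misleading: the root cause is the 403 earlier in the traceback. The log also shows that the token is fine-grained ("permission: fineGrained") and that the Hub answered "Please enable access to public gated repositories in your fine-grained token settings", so that permission has to be enabled on the token as well.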
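
If you want to check a token before starting another training run, here is a minimal sketch using only the Python standard library. It sends a HEAD request to the same config.json URL that fails in the log. The function names (`build_request`, `explain_status`, `check_access`) and the hint texts are my own, not part of any library, and the status mapping is an assumption based on the 403 in the log plus ordinary HTTP semantics.

```python
import urllib.error
import urllib.request

# URL copied from the 403 line in the log above.
CONFIG_URL = (
    "https://huggingface.co/black-forest-labs/FLUX.1-dev"
    "/resolve/main/transformer/config.json"
)


def build_request(token: str, url: str = CONFIG_URL) -> urllib.request.Request:
    """Build a HEAD request that sends the token as a Bearer header."""
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="HEAD",
    )


def explain_status(status: int) -> str:
    """Turn an HTTP status code into a hint (wording is illustrative)."""
    if 200 <= status < 400:
        return "ok: the token can read the gated repo"
    if status == 401:
        return "unauthorized: the token is missing or invalid"
    if status == 403:
        return (
            "forbidden: request access on the model page and, for a "
            "fine-grained token, enable access to public gated repositories"
        )
    return f"unexpected status {status}"


def check_access(token: str) -> str:
    """Do the HEAD request (needs network) and explain the result."""
    try:
        with urllib.request.urlopen(build_request(token), timeout=30) as resp:
            return explain_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses such as the 403 in the log.
        return explain_status(err.code)
```

Calling `print(check_access("hf_..."))` with your own token should print the "ok" hint once access has been granted and the token permission is enabled. If it still prints the "forbidden" hint, the training run will most likely fail the same way.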