Training Flux Lora Failed

Kennikenino · September 27, 2024, 9:46am

I used the steps:
youtube /watch?v=bAmsJGOkkkw

I got this error three times:
Training failed.
Command '['python', 'run.py', 'config/replicate.yml']' returned non-zero exit status

Can someone please help?
LOG:

Starting prediction
The token has not been saved to the git credentials helper. Pass add_to_git_credential=True in this function directly or --add-to-git-credential if using via huggingface-cli if you want to set the git credential as well.
Token is valid (permission: fineGrained).
Your token has been saved to /root/.cache/huggingface/token
Login successful
Detected zip file
Extracted zip file
Running 1 job
The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling transformers.utils.move_cache().
0it [00:00, ?it/s]
0it [00:00, ?it/s]
/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/controlnet_aux/mediapipe_face/mediapipe_face_common.py:7: UserWarning: The module 'mediapipe' is not installed. The package will have limited functionality. Please install it using the command: pip install 'mediapipe'
  warnings.warn(
/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_5m_224 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_5m_224. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
  return register_model(fn_wrapper)
/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_11m_224 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_11m_224. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
  return register_model(fn_wrapper)
/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_21m_224 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_21m_224. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
  return register_model(fn_wrapper)
/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_21m_384 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_21m_384. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
  return register_model(fn_wrapper)
/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/controlnet_aux/segment_anything/modeling/tiny_vit_sam.py:654: UserWarning: Overwriting tiny_vit_21m_512 in registry with controlnet_aux.segment_anything.modeling.tiny_vit_sam.tiny_vit_21m_512. This is because the name being registered conflicts with an existing name. Please check if this is not expected.
  return register_model(fn_wrapper)
{
  "datasets": [
    {
      "cache_latents_to_disk": true,
      "caption_dropout_rate": 0.05,
      "caption_ext": "txt",
      "folder_path": "input_images",
      "resolution": [
        512,
        768,
        1024
      ],
      "shuffle_tokens": false
    }
  ],
  "device": "cuda:0",
  "model": {
    "is_flux": true,
    "name_or_path": "black-forest-labs/FLUX.1-dev",
    "quantize": true
  },
  "network": {
    "linear": 16,
    "linear_alpha": 16,
    "type": "lora"
  },
  "sample": {
    "guidance_scale": 4,
    "height": 1024,
    "neg": "",
    "prompts": [
      "a sign that says 'I LOVE PROMPTS!' in the style of [trigger]"
    ],
    "sample_every": 250,
    "sample_steps": 20,
    "sampler": "flowmatch",
    "seed": 42,
    "walk_seed": true,
    "width": 1024
  },
  "save": {
    "dtype": "float16",
    "max_step_saves_to_keep": 1,
    "save_every": 2011
  },
  "train": {
    "batch_size": 1,
    "content_or_style": "balanced",
    "dtype": "bf16",
    "ema_config": {
      "ema_decay": 0.99,
      "use_ema": true
    },
    "gradient_accumulation_steps": 1,
    "gradient_checkpointing": true,
    "lr": 0.0004,
    "noise_scheduler": "flowmatch",
    "optimizer": "adamw8bit",
    "steps": 2010,
    "train_text_encoder": false,
    "train_unet": true
  },
  "training_folder": "output",
  "trigger_word": "TOK",
  "type": "sd_trainer"
}
Using EMA
/src/extensions_built_in/sd_trainer/SDTrainer.py:61: FutureWarning: torch.cuda.amp.GradScaler(args...) is deprecated. Please use torch.amp.GradScaler('cuda', args...) instead.
self.scaler = torch.cuda.amp.GradScaler()
#############################################

Running job: flux_train_replicate

#############################################
Running 1 process
Loading Flux model
Loading transformer
Error running job: black-forest-labs/FLUX.1-dev does not appear to have a file named config.json.

Result:

0 completed jobs
1 failure
========================================
Traceback (most recent call last):
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status
    response.raise_for_status()
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/transformer/config.json

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1751, in _get_metadata_or_catch_error
    metadata = get_hf_file_metadata(
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1673, in get_hf_file_metadata
    r = _request_wrapper(
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 376, in _request_wrapper
    response = _request_wrapper(
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 400, in _request_wrapper
    hf_raise_for_status(response)
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 367, in hf_raise_for_status
    raise HfHubHTTPError(message, response=response) from e
huggingface_hub.utils._errors.HfHubHTTPError: (Request ID: Root=1-66f67aeb-2e5d11046011068a4df3d290;aa99d9fa-3cee-45ea-a1e0-3eeb3edb7e68)
403 Forbidden: Please enable access to public gated repositories in your fine-grained token settings to view this repository…
Cannot access content at: https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/transformer/config.json.
If you are trying to create or update content, make sure you have a token with the write role.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/diffusers/configuration_utils.py", line 379, in load_config
    config_file = hf_hub_download(
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/utils/_deprecation.py", line 101, in inner_f
    return f(*args, **kwargs)
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1240, in hf_hub_download
    return _hf_hub_download_to_cache_dir(
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1347, in _hf_hub_download_to_cache_dir
    _raise_on_head_call_error(head_call_error, force_download, local_files_only)
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1857, in _raise_on_head_call_error
    raise LocalEntryNotFoundError(
huggingface_hub.utils._errors.LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/src/run.py", line 90, in <module>
    main()
  File "/src/run.py", line 86, in main
    raise e
  File "/src/run.py", line 78, in main
    job.run()
  File "/src/jobs/ExtensionJob.py", line 22, in run
    process.run()
  File "/src/jobs/process/BaseSDTrainProcess.py", line 1230, in run
    self.sd.load_model()
  File "/src/toolkit/stable_diffusion_model.py", line 469, in load_model
    transformer = FluxTransformer2DModel.from_pretrained(
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/diffusers/models/modeling_utils.py", line 612, in from_pretrained
    config, unused_kwargs, commit_hash = cls.load_config(
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/diffusers/configuration_utils.py", line 406, in load_config
    raise EnvironmentError(
OSError: black-forest-labs/FLUX.1-dev does not appear to have a file named config.json.

Traceback (most recent call last):
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/site-packages/cog/server/worker.py", line 349, in _predict
    result = predict(**payload)
  File "/src/train.py", line 74, in train
    subprocess.check_call(["python", "run.py", "config/replicate.yml"], close_fds=False)
  File "/root/.pyenv/versions/3.10.14/lib/python3.10/subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['python', 'run.py', 'config/replicate.yml']' returned non-zero exit status 1.

guycas · September 29, 2024, 4:13am

How did you fix this?

ikoolo · October 12, 2024, 6:33pm

Same here. I'm an absolute beginner too, so any help is gratefully accepted.

John6666 · October 12, 2024, 10:23pm

Flux dev is a gated model, so you’ll have to get permission from the following page to use it.
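
A note on the log in the first post: the final OSError about a missing config.json is misleading. The first exception in the chain is a 403 on transformer/config.json, and its message asks for access to public gated repositories to be enabled in the fine-grained token settings. So besides getting access to the model, the token used for training probably needs that permission as well. Here is a minimal sketch, using only the Python standard library, of a probe that requests the same file with a token before a training run is started. The names `build_probe`, `explain_status`, and `probe` are made up for illustration, the URL pattern is copied from the log, and the status-to-cause mapping is a rough heuristic rather than anything official.

```python
import urllib.error
import urllib.request

HUB_URL = "https://huggingface.co"


def build_probe(repo_id: str, filename: str, token: str) -> urllib.request.Request:
    """Build a HEAD request for one file in a Hub repo, sent with `token`."""
    # Same URL shape as the one that returned 403 in the log.
    url = f"{HUB_URL}/{repo_id}/resolve/main/{filename}"
    return urllib.request.Request(
        url,
        method="HEAD",
        headers={"Authorization": f"Bearer {token}"},
    )


def explain_status(status: int) -> str:
    """Map an HTTP status to a likely cause (heuristic, hypothetical wording)."""
    if status == 200:
        return "ok: the token can read this file"
    if status == 401:
        return "unauthorized: the token is missing or invalid"
    if status == 403:
        return (
            "forbidden: access to this gated repo has not been granted, "
            "or a fine-grained token lacks the gated-repo read permission"
        )
    if status == 404:
        return "not found: check the repo id and file path"
    return f"unexpected HTTP status {status}"


def probe(repo_id: str, filename: str, token: str) -> str:
    """Send the request and describe the outcome. Needs network access."""
    try:
        with urllib.request.urlopen(build_probe(repo_id, filename, token), timeout=30) as resp:
            return explain_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the status code is on the exception.
        return explain_status(err.code)
```

Calling `probe("black-forest-labs/FLUX.1-dev", "transformer/config.json", token)` with the same token that is passed to the trainer should report "ok" once both the model access and the token permission are in place. If it still reports "forbidden", the training job will most likely fail the same way.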
