Crash during training

Hi, what am I doing wrong?
Paid account…

special_tokens_map.json: 100%|██████████| 1.14k/1.14k [00:00<00:00, 7.42MB/s]
  0%|          | 0/204 [00:00<?, ?it/s]❌ ERROR  | 2023-12-19 07:24:56 | autotrain.trainers.common:wrapper:79 - train has failed due to an exception: Traceback (most recent call last):
  File "/app/src/autotrain/trainers/common.py", line 76, in wrapper
    return func(*args, **kwargs)
  File "/app/src/autotrain/trainers/seq2seq/__main__.py", line 216, in train
    trainer.train()
  File "/app/env/lib/python3.10/site-packages/transformers/trainer.py", line 1537, in train
    return inner_training_loop(
  File "/app/env/lib/python3.10/site-packages/transformers/trainer.py", line 1821, in _inner_training_loop
    for step, inputs in enumerate(epoch_iterator):
  File "/app/env/lib/python3.10/site-packages/accelerate/data_loader.py", line 448, in __iter__
    current_batch = next(dataloader_iter)
  File "/app/env/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
    data = self._next_data()
  File "/app/env/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 674, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/app/env/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/app/env/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/app/src/autotrain/trainers/seq2seq/dataset.py", line 18, in __getitem__
    labels = self.tokenizer(text_target=target, max_length=self.max_len_target, truncation=True)
  File "/app/env/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2804, in __call__
    self._switch_to_target_mode()
  File "/app/env/lib/python3.10/site-packages/transformers/models/m2m_100/tokenization_m2m_100.py", line 361, in _switch_to_target_mode
    self.set_tgt_lang_special_tokens(self.tgt_lang)
  File "/app/env/lib/python3.10/site-packages/transformers/models/m2m_100/tokenization_m2m_100.py", line 372, in set_tgt_lang_special_tokens
    lang_token = self.get_lang_token(tgt_lang)
  File "/app/env/lib/python3.10/site-packages/transformers/models/m2m_100/tokenization_m2m_100.py", line 378, in get_lang_token
    return self.lang_code_to_token[lang]
KeyError: None

❌ ERROR  | 2023-12-19 07:24:56 | autotrain.trainers.common:wrapper:80 - None
🚀 INFO   | 2023-12-19 07:24:56 | autotrain.trainers.common:pause_space:44 - Pausing space...
Traceback (most recent call last):
  File "/app/src/autotrain/trainers/common.py", line 76, in wrapper
    return func(*args, **kwargs)
  File "/app/src/autotrain/trainers/seq2seq/__main__.py", line 216, in train
    trainer.train()
  File "/app/env/lib/python3.10/site-packages/transformers/trainer.py", line 1537, in train
    return inner_training_loop(
  File "/app/env/lib/python3.10/site-packages/transformers/trainer.py", line 1821, in _inner_training_loop
    for step, inputs in enumerate(epoch_iterator):
  File "/app/env/lib/python3.10/site-packages/accelerate/data_loader.py", line 448, in __iter__
    current_batch = next(dataloader_iter)
  File "/app/env/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
    data = self._next_data()
  File "/app/env/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 674, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/app/env/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/app/env/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/app/src/autotrain/trainers/seq2seq/dataset.py", line 18, in __getitem__
    labels = self.tokenizer(text_target=target, max_length=self.max_len_target, truncation=True)
  File "/app/env/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2804, in __call__
    self._switch_to_target_mode()
  File "/app/env/lib/python3.10/site-packages/transformers/models/m2m_100/tokenization_m2m_100.py", line 361, in _switch_to_target_mode
    self.set_tgt_lang_special_tokens(self.tgt_lang)
  File "/app/env/lib/python3.10/site-packages/transformers/models/m2m_100/tokenization_m2m_100.py", line 372, in set_tgt_lang_special_tokens
    lang_token = self.get_lang_token(tgt_lang)
  File "/app/env/lib/python3.10/site-packages/transformers/models/m2m_100/tokenization_m2m_100.py", line 378, in get_lang_token
    return self.lang_code_to_token[lang]
KeyError: None

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/env/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 270, in hf_raise_for_status
    response.raise_for_status()
  File "/app/env/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 429 Client Error: Too Many Requests for url: https://huggingface.co/api/spaces/emilzak/autotrain-date_parser_v9-0/discussions

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/env/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/app/env/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/app/src/autotrain/trainers/seq2seq/__main__.py", line 248, in <module>
    train(config)
  File "/app/src/autotrain/trainers/common.py", line 81, in wrapper
    pause_space(config, is_failure=True)
  File "/app/src/autotrain/trainers/common.py", line 55, in pause_space
    api.create_discussion(
  File "/app/env/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/app/env/lib/python3.10/site-packages/huggingface_hub/hf_api.py", line 5126, in create_discussion
    hf_raise_for_status(resp)
  File "/app/env/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 330, in hf_raise_for_status
    raise HfHubHTTPError(str(e), response=response) from e
huggingface_hub.utils._errors.HfHubHTTPError: 429 Client Error: Too Many Requests for url: https://huggingface.co/api/spaces/emilzak/autotrain-date_parser_v9-0/discussions (Request ID: Root=1-65814548-7eb4e49b090b61b259484c16;1a13be60-f921-4401-ba02-7eb7781263d5)

Oops 😱 You've been rate limited. For safety reasons, we limit the number of write operations for new users. Please try again in 24 hours or get in touch with us at website@huggingface.co if you need access now.
  0%|          | 0/204 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/app/env/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/app/env/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 47, in main
    args.func(args)
  File "/app/env/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1017, in launch_command
    simple_launcher(args)
  File "/app/env/lib/python3.10/site-packages/accelerate/commands/launch.py", line 637, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/app/env/bin/python', '-m', 'autotrain.trainers.seq2seq', '--training_config', '/tmp/model/training_params.json']' returned non-zero exit status 1.
> INFO    Process 41 is already completed. Skipping...
> INFO    No running jobs found. Shutting down the server.
INFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [35]

Same error here. I tried AutoTrain with DreamBooth LoRA XL.

Did you find a solution?

Hi @emilzak, just to double-check: I see you're a Pro user. Are you still having issues after subscribing to Pro? Rate limits differ between Free and Pro accounts.

Yes, support solved my issue. Thanks!
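
For anyone landing here with the same log: it contains two separate failures. The first traceback ends in `KeyError: None` raised from `get_lang_token`, which suggests the tokenizer's `tgt_lang` was never set. The 429 rate-limit error only appears afterwards, while the Space is being paused. The sketch below reproduces just the lookup from the last traceback frame. `ToyLangTokenizer`, its two-entry language table, and `target_token_or_error` are hypothetical stand-ins for illustration, not the real `transformers` classes.

```python
class ToyLangTokenizer:
    """Hypothetical stand-in that mirrors the lookup seen in the traceback."""

    def __init__(self, tgt_lang=None):
        # Assumed two-language table; the real tokenizer ships a much larger one.
        self.lang_code_to_token = {"en": "__en__", "fr": "__fr__"}
        self.tgt_lang = tgt_lang

    def get_lang_token(self, lang):
        # Same dictionary lookup as the last frame of the traceback.
        return self.lang_code_to_token[lang]

    def switch_to_target_mode(self):
        # Uses whatever tgt_lang currently holds, including None.
        return self.get_lang_token(self.tgt_lang)


def target_token_or_error(tokenizer):
    """Return the target-language token, or a readable message if the lookup fails."""
    try:
        return tokenizer.switch_to_target_mode()
    except KeyError as exc:
        return f"KeyError: {exc}"


print(target_token_or_error(ToyLangTokenizer()))               # KeyError: None
print(target_token_or_error(ToyLangTokenizer(tgt_lang="en")))  # __en__
```

With the real `M2M100Tokenizer`, the equivalent step would be setting `tgt_lang` (and usually `src_lang`) to a supported language code before tokenizing targets. Whether AutoTrain exposes that setting is not shown in this thread.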
