What am I doing wrong? I'm fine-tuning facebook/m2m100_418M on a small date-parsing dataset with Hugging Face AutoTrain (seq2seq task), and training crashes on the very first batch with `KeyError: None` from inside the M2M100 tokenizer. The full log from the training Space is below; my analysis and a minimal reproduction follow it.
==========
== CUDA ==
==========
CUDA Version 12.1.1
Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
> INFO AUTOTRAIN_USERNAME: emilzak
> INFO PROJECT_NAME: date_parser_v9-0
> INFO TASK_ID: 28
> INFO DATA_PATH: emilzak/autotrain-data-date_parser_v9
> INFO MODEL: facebook/m2m100_418M
> INFO OUTPUT_MODEL_REPO: emilzak/date_parser_v9-0
INFO: Started server process [34]
INFO: Waiting for application startup.
> INFO {'data_path': 'emilzak/autotrain-data-date_parser_v9', 'model': 'facebook/m2m100_418M', 'username': 'emilzak', 'seed': 42, 'train_split': 'train', 'valid_split': 'validation', 'project_name': 'date_parser_v9-0', 'token': 'hf_**********************************', 'push_to_hub': True, 'text_column': 'autotrain_text', 'target_column': 'autotrain_label', 'repo_id': 'emilzak/date_parser_v9-0', 'lr': 5e-05, 'epochs': 3, 'max_seq_length': 128, 'max_target_length': 128, 'batch_size': 8, 'warmup_ratio': 0.1, 'gradient_accumulation': 1, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'logging_steps': -1, 'evaluation_strategy': 'epoch', 'auto_find_batch_size': False, 'mixed_precision': 'fp16', 'save_total_limit': 1, 'save_strategy': 'epoch', 'peft': False, 'quantization': None, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'target_modules': []}
> INFO ['accelerate', 'launch', '--num_machines', '1', '--num_processes', '1', '--mixed_precision', 'fp16', '-m', 'autotrain.trainers.seq2seq', '--training_config', '/tmp/model/training_params.json']
> INFO Started training with PID 40
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:7860 (Press CTRL+C to quit)
The following values were not passed to `accelerate launch` and had defaults used instead:
`--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
Downloading builder script: 100%|██████████| 6.27k/6.27k [00:00<00:00, 21.9MB/s]
🚀 INFO | 2023-12-19 07:49:35 | __main__:train:47 - Starting training...
🚀 INFO | 2023-12-19 07:49:35 | __main__:train:48 - Training config: {'data_path': 'emilzak/autotrain-data-date_parser_v9', 'model': 'facebook/m2m100_418M', 'username': 'emilzak', 'seed': 42, 'train_split': 'train', 'valid_split': 'validation', 'project_name': '/tmp/model', 'token': '*****', 'push_to_hub': True, 'text_column': 'autotrain_text', 'target_column': 'autotrain_label', 'repo_id': 'emilzak/date_parser_v9-0', 'lr': 5e-05, 'epochs': 3, 'max_seq_length': 128, 'max_target_length': 128, 'batch_size': 8, 'warmup_ratio': 0.1, 'gradient_accumulation': 1, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'logging_steps': -1, 'evaluation_strategy': 'epoch', 'auto_find_batch_size': False, 'mixed_precision': 'fp16', 'save_total_limit': 1, 'save_strategy': 'epoch', 'peft': False, 'quantization': None, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'target_modules': []}
Downloading readme: 100%|██████████| 617/617 [00:00<00:00, 6.19MB/s]
Downloading data: 100%|██████████| 14.7k/14.7k [00:00<00:00, 84.3kB/s]
Downloading data: 100%|██████████| 5.16k/5.16k [00:00<00:00, 98.2kB/s]
Downloading data files: 100%|██████████| 2/2 [00:00<00:00, 8.65it/s]
Extracting data files: 100%|██████████| 2/2 [00:00<00:00, 1665.73it/s]
Generating train split: 100%|██████████| 544/544 [00:00<00:00, 126634.55 examples/s]
Generating validation split: 100%|██████████| 136/136 [00:00<00:00, 98860.54 examples/s]
config.json: 100%|██████████| 908/908 [00:00<00:00, 7.51MB/s]
pytorch_model.bin: 100%|█████████▉| 1.94G/1.94G [00:05<00:00, 350MB/s]
generation_config.json: 100%|██████████| 233/233 [00:00<00:00, 1.79MB/s]
tokenizer_config.json: 100%|██████████| 272/272 [00:00<00:00, 2.03MB/s]
vocab.json: 100%|██████████| 3.71M/3.71M [00:00<00:00, 20.9MB/s]
sentencepiece.bpe.model: 100%|██████████| 2.42M/2.42M [00:00<00:00, 31.4MB/s]
special_tokens_map.json: 100%|██████████| 1.14k/1.14k [00:00<00:00, 9.90MB/s]
  0%|          | 0/204 [00:00<?, ?it/s]
❌ ERROR | 2023-12-19 07:49:49 | autotrain.trainers.common:wrapper:79 - train has failed due to an exception: Traceback (most recent call last):
File "/app/src/autotrain/trainers/common.py", line 76, in wrapper
return func(*args, **kwargs)
File "/app/src/autotrain/trainers/seq2seq/__main__.py", line 216, in train
trainer.train()
File "/app/env/lib/python3.10/site-packages/transformers/trainer.py", line 1537, in train
return inner_training_loop(
File "/app/env/lib/python3.10/site-packages/transformers/trainer.py", line 1821, in _inner_training_loop
for step, inputs in enumerate(epoch_iterator):
File "/app/env/lib/python3.10/site-packages/accelerate/data_loader.py", line 448, in __iter__
current_batch = next(dataloader_iter)
File "/app/env/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
data = self._next_data()
File "/app/env/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 674, in _next_data
data = self._dataset_fetcher.fetch(index) # may raise StopIteration
File "/app/env/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/app/env/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/app/src/autotrain/trainers/seq2seq/dataset.py", line 18, in __getitem__
labels = self.tokenizer(text_target=target, max_length=self.max_len_target, truncation=True)
File "/app/env/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2804, in __call__
self._switch_to_target_mode()
File "/app/env/lib/python3.10/site-packages/transformers/models/m2m_100/tokenization_m2m_100.py", line 361, in _switch_to_target_mode
self.set_tgt_lang_special_tokens(self.tgt_lang)
File "/app/env/lib/python3.10/site-packages/transformers/models/m2m_100/tokenization_m2m_100.py", line 372, in set_tgt_lang_special_tokens
lang_token = self.get_lang_token(tgt_lang)
File "/app/env/lib/python3.10/site-packages/transformers/models/m2m_100/tokenization_m2m_100.py", line 378, in get_lang_token
return self.lang_code_to_token[lang]
KeyError: None
❌ ERROR | 2023-12-19 07:49:49 | autotrain.trainers.common:wrapper:80 - None
🚀 INFO | 2023-12-19 07:49:49 | autotrain.trainers.common:pause_space:44 - Pausing space...
Traceback (most recent call last):
File "/app/src/autotrain/trainers/common.py", line 76, in wrapper
return func(*args, **kwargs)
File "/app/src/autotrain/trainers/seq2seq/__main__.py", line 216, in train
trainer.train()
File "/app/env/lib/python3.10/site-packages/transformers/trainer.py", line 1537, in train
return inner_training_loop(
File "/app/env/lib/python3.10/site-packages/transformers/trainer.py", line 1821, in _inner_training_loop
for step, inputs in enumerate(epoch_iterator):
File "/app/env/lib/python3.10/site-packages/accelerate/data_loader.py", line 448, in __iter__
current_batch = next(dataloader_iter)
File "/app/env/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
data = self._next_data()
File "/app/env/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 674, in _next_data
data = self._dataset_fetcher.fetch(index) # may raise StopIteration
File "/app/env/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/app/env/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/app/src/autotrain/trainers/seq2seq/dataset.py", line 18, in __getitem__
labels = self.tokenizer(text_target=target, max_length=self.max_len_target, truncation=True)
File "/app/env/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 2804, in __call__
self._switch_to_target_mode()
File "/app/env/lib/python3.10/site-packages/transformers/models/m2m_100/tokenization_m2m_100.py", line 361, in _switch_to_target_mode
self.set_tgt_lang_special_tokens(self.tgt_lang)
File "/app/env/lib/python3.10/site-packages/transformers/models/m2m_100/tokenization_m2m_100.py", line 372, in set_tgt_lang_special_tokens
lang_token = self.get_lang_token(tgt_lang)
File "/app/env/lib/python3.10/site-packages/transformers/models/m2m_100/tokenization_m2m_100.py", line 378, in get_lang_token
return self.lang_code_to_token[lang]
KeyError: None
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/app/env/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 270, in hf_raise_for_status
response.raise_for_status()
File "/app/env/lib/python3.10/site-packages/requests/models.py", line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 429 Client Error: Too Many Requests for url: https://huggingface.co/api/spaces/emilzak/autotrain-date_parser_v9-0/discussions
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/app/env/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/app/env/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/app/src/autotrain/trainers/seq2seq/__main__.py", line 248, in <module>
train(config)
File "/app/src/autotrain/trainers/common.py", line 81, in wrapper
pause_space(config, is_failure=True)
File "/app/src/autotrain/trainers/common.py", line 55, in pause_space
api.create_discussion(
File "/app/env/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
File "/app/env/lib/python3.10/site-packages/huggingface_hub/hf_api.py", line 5126, in create_discussion
hf_raise_for_status(resp)
File "/app/env/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 330, in hf_raise_for_status
raise HfHubHTTPError(str(e), response=response) from e
huggingface_hub.utils._errors.HfHubHTTPError: 429 Client Error: Too Many Requests for url: https://huggingface.co/api/spaces/emilzak/autotrain-date_parser_v9-0/discussions (Request ID: Root=1-65814b1d-7783c5e03a0006782cf21ce8;91d498ce-240b-485b-b65a-3b52ae9b3735)
Oops 😱 You've been rate limited. For safety reasons, we limit the number of write operations for new users. Please try again in 24 hours or get in touch with us at website@huggingface.co if you need access now.
Traceback (most recent call last):
File "/app/env/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/app/env/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 47, in main
args.func(args)
File "/app/env/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1017, in launch_command
simple_launcher(args)
File "/app/env/lib/python3.10/site-packages/accelerate/commands/launch.py", line 637, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/app/env/bin/python', '-m', 'autotrain.trainers.seq2seq', '--training_config', '/tmp/model/training_params.json']' returned non-zero exit status 1.
> INFO Process 40 is already completed. Skipping...
> INFO No running jobs found. Shutting down the server.
INFO: Shutting down
INFO: Waiting for application shutdown.
INFO: Application shutdown complete.
INFO: Finished server process [34]
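
From the traceback, the failure looks like a tokenizer-configuration problem rather than a data problem: AutoTrain's `dataset.py` (line 18 in the trace) calls `tokenizer(text_target=target, ...)`, which switches the M2M100 tokenizer into target mode, and that path looks up `lang_code_to_token[self.tgt_lang]`. Since nothing in the training config printed above sets a target language, `tgt_lang` is still `None`, hence the `KeyError: None`. Here is a minimal sketch that I believe reproduces the same error outside AutoTrain; the example string and the `"en"` language code are my own assumptions, not anything taken from AutoTrain:

```python
from transformers import M2M100Tokenizer

# Same checkpoint the log downloads above.
tok = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M")
print(tok.tgt_lang)  # None -- no target language is ever set

# This mirrors dataset.py line 18 and raises KeyError: None,
# because _switch_to_target_mode() looks up lang_code_to_token[None]:
#   tok(text_target="December 19, 2023", max_length=128, truncation=True)

# Setting the language codes first makes the identical call succeed
# ("en" is just my guess for a date-normalization dataset):
tok.src_lang = "en"
tok.tgt_lang = "en"
labels = tok(text_target="December 19, 2023", max_length=128, truncation=True)
print(labels["input_ids"])
```

If that diagnosis is right, my actual question is: how do I tell AutoTrain which source/target language codes to use for an M2M100 seq2seq project? The training config above has no language field at all. (The second traceback, the 429 from `create_discussion`, appears to be a side effect: after training failed, AutoTrain tried to report the failure and pause the Space, and my new account hit the write rate limit.)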