AutoTrain not working at all

I’ve tried to fine-tune multiple models on many different datasets. Each time, once I click the Start Training button it turns red for a couple of seconds and then turns blue again, and no training happens. Since this occurs regardless of the model or dataset, I’ve included the log below.

```
Device 0: Tesla T4 - 434.4MiB/15360MiB

INFO | 2024-06-27 09:24:28 | autotrain.commands:launch_command:401 - {'model': 'microsoft/Phi-3-mini-4k-instruct', 'project_name': 'autotrain-8v782-tuct3', 'data_path': 'huggingfacepremium/train', 'train_split': 'train', 'valid_split': None, 'add_eos_token': True, 'block_size': 1024, 'model_max_length': 2048, 'padding': 'right', 'trainer': 'sft', 'use_flash_attention_2': False, 'log': 'tensorboard', 'disable_gradient_checkpointing': False, 'logging_steps': -1, 'eval_strategy': 'epoch', 'save_total_limit': 1, 'auto_find_batch_size': False, 'mixed_precision': 'fp16', 'lr': 3e-05, 'epochs': 3, 'batch_size': 2, 'warmup_ratio': 0.1, 'gradient_accumulation': 4, 'optimizer': 'adamw_torch', 'scheduler': 'linear', 'weight_decay': 0.0, 'max_grad_norm': 1.0, 'seed': 42, 'chat_template': 'none', 'quantization': 'int4', 'target_modules': 'all-linear', 'merge_adapter': False, 'peft': True, 'lora_r': 16, 'lora_alpha': 32, 'lora_dropout': 0.05, 'model_ref': None, 'dpo_beta': 0.1, 'max_prompt_length': 128, 'max_completion_length': None, 'prompt_text_column': 'prompt', 'text_column': 'text', 'rejected_text_column': 'rejected_text', 'push_to_hub': True, 'username': 'huggingfacepremium', 'token': '*****', 'unsloth': False}
INFO | 2024-06-27 09:24:28 | autotrain.backends.local:create:13 - Training PID: 69
The following values were not passed to accelerate launch and had defaults used instead:
        --dynamo_backend was set to a value of 'no'
To avoid this warning pass in values for each of the problematic parameters or run accelerate config.
[2024-06-27 09:24:36,228] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[WARNING] async_io requires the dev libaio .so object and headers but these were not found.
[WARNING] async_io: please install the libaio-dev package with apt
[WARNING] If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
[WARNING] Please specify the CUTLASS repo directory as environment variable $CUTLASS_PATH
[WARNING] NVIDIA Inference is only supported on Ampere and newer architectures
[WARNING] sparse_attn requires a torch version >= 1.5 and < 2.0 but detected 2.3
[WARNING] using untested triton version (2.3.0), only 1.0.0 is known to be compatible
INFO | 2024-06-27 09:24:36 | autotrain.trainers.clm.train_clm_sft:train:12 - Starting SFT training…
Downloading readme: 100%|██████████| 139/139 [00:00<00:00, 839kB/s]
Downloading data: 100%|██████████| 600k/600k [00:00<00:00, 20.6MB/s]
Generating train split: 100%|██████████| 1110/1110 [00:00<00:00, 93188.10 examples/s]
INFO | 2024-06-27 09:24:37 | autotrain.trainers.clm.utils:process_input_data:394 - Train data: Dataset({
    features: ['question', 'answer'],
    num_rows: 1110
})
INFO | 2024-06-27 09:24:37 | autotrain.trainers.clm.utils:process_input_data:395 - Valid data: None
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
INFO | 2024-06-27 09:24:38 | autotrain.trainers.clm.utils:configure_logging_steps:467 - configuring logging steps
INFO | 2024-06-27 09:24:38 | autotrain.trainers.clm.utils:configure_logging_steps:480 - Logging steps: 25
INFO | 2024-06-27 09:24:38 | autotrain.trainers.clm.utils:configure_training_args:485 - configuring training args
INFO | 2024-06-27 09:24:38 | autotrain.trainers.clm.utils:configure_block_size:548 - Using block size 1024
/app/env/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
  warnings.warn(
A new version of the following files was downloaded from https://huggingface.co/microsoft/Phi-3-mini-4k-instruct:
- configuration_phi3.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
INFO | 2024-06-27 09:24:38 | autotrain.trainers.clm.utils:get_model:583 - Can use unsloth: False
WARNING | 2024-06-27 09:24:38 | autotrain.trainers.clm.utils:get_model:625 - Unsloth not available, continuing without it…
INFO | 2024-06-27 09:24:38 | autotrain.trainers.clm.utils:get_model:627 - loading model config…
INFO | 2024-06-27 09:24:38 | autotrain.trainers.clm.utils:get_model:635 - loading model…
A new version of the following files was downloaded from https://huggingface.co/microsoft/Phi-3-mini-4k-instruct:
- modeling_phi3.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
low_cpu_mem_usage was None, now set to True since model is quantized.
Downloading shards: 100%|██████████| 2/2 [00:33<00:00, 16.66s/it]
Downloading shards: 100%|██████████| 2/2 [00:33<00:00, 16.96s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [00:13<00:00, 6.48s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [00:13<00:00, 6.98s/it]
INFO | 2024-06-27 09:25:27 | autotrain.trainers.clm.utils:get_model:666 - model dtype: torch.float16
INFO | 2024-06-27 09:25:27 | autotrain.trainers.clm.train_clm_sft:train:37 - creating trainer
/app/env/lib/python3.10/site-packages/huggingface_hub/utils/_deprecation.py:100: FutureWarning: Deprecated argument(s) used in '__init__': dataset_text_field, max_seq_length, packing. Will not be supported from version '1.0.0'.
Deprecated positional argument(s) used in SFTTrainer, please use the SFTConfig to set these arguments instead.
  warnings.warn(message, FutureWarning)
/app/env/lib/python3.10/site-packages/transformers/training_args.py:1965: FutureWarning: --push_to_hub_token is deprecated and will be removed in version 5 of 🤗 Transformers. Use --hub_token instead.
  warnings.warn(
/app/env/lib/python3.10/site-packages/trl/trainer/sft_trainer.py:181: UserWarning: You passed a packing argument to the SFTTrainer, the value you passed will override the one in the SFTConfig.
  warnings.warn(
/app/env/lib/python3.10/site-packages/transformers/training_args.py:1965: FutureWarning: --push_to_hub_token is deprecated and will be removed in version 5 of 🤗 Transformers. Use --hub_token instead.
  warnings.warn(
/app/env/lib/python3.10/site-packages/trl/trainer/sft_trainer.py:269: UserWarning: You passed a max_seq_length argument to the SFTTrainer, the value you passed will override the one in the SFTConfig.
  warnings.warn(
/app/env/lib/python3.10/site-packages/trl/trainer/sft_trainer.py:307: UserWarning: You passed a dataset_text_field argument to the SFTTrainer, the value you passed will override the one in the SFTConfig.
  warnings.warn(
Generating train split: 0 examples [00:00, ? examples/s]
Generating train split: 0 examples [00:00, ? examples/s]
ERROR | 2024-06-27 09:25:27 | autotrain.trainers.common:wrapper:120 - train has failed due to an exception: Traceback (most recent call last):
  File "/app/env/lib/python3.10/site-packages/datasets/builder.py", line 1748, in _prepare_split_single
    for key, record in generator:
  File "/app/env/lib/python3.10/site-packages/datasets/packaged_modules/generator/generator.py", line 30, in _generate_examples
    for idx, ex in enumerate(self.config.generator(**gen_kwargs)):
  File "/app/env/lib/python3.10/site-packages/trl/trainer/sft_trainer.py", line 620, in data_generator
    yield from constant_length_iterator
  File "/app/env/lib/python3.10/site-packages/trl/trainer/utils.py", line 503, in __iter__
    buffer.append(self.formatting_func(next(iterator)))
  File "/app/env/lib/python3.10/site-packages/trl/trainer/utils.py", line 480, in <lambda>
    self.formatting_func = lambda x: x[dataset_text_field]
KeyError: 'text'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/env/lib/python3.10/site-packages/trl/trainer/sft_trainer.py", line 623, in _prepare_packed_dataloader
    packed_dataset = Dataset.from_generator(
  File "/app/env/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 1125, in from_generator
    ).read()
  File "/app/env/lib/python3.10/site-packages/datasets/io/generator.py", line 47, in read
    self.builder.download_and_prepare(
  File "/app/env/lib/python3.10/site-packages/datasets/builder.py", line 1027, in download_and_prepare
    self._download_and_prepare(
  File "/app/env/lib/python3.10/site-packages/datasets/builder.py", line 1789, in _download_and_prepare
    super()._download_and_prepare(
  File "/app/env/lib/python3.10/site-packages/datasets/builder.py", line 1122, in _download_and_prepare
    self._prepare_split(split_generator, **prepare_split_kwargs)
  File "/app/env/lib/python3.10/site-packages/datasets/builder.py", line 1627, in _prepare_split
    for job_id, done, content in self._prepare_split_single(
  File "/app/env/lib/python3.10/site-packages/datasets/builder.py", line 1784, in _prepare_split_single
    raise DatasetGenerationError("An error occurred while generating the dataset") from e
datasets.exceptions.DatasetGenerationError: An error occurred while generating the dataset

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/app/env/lib/python3.10/site-packages/autotrain/trainers/common.py", line 117, in wrapper
    return func(*args, **kwargs)
  File "/app/env/lib/python3.10/site-packages/autotrain/trainers/clm/main.py", line 28, in train
    train_sft(config)
  File "/app/env/lib/python3.10/site-packages/autotrain/trainers/clm/train_clm_sft.py", line 44, in train
    trainer = SFTTrainer(
  File "/app/env/lib/python3.10/site-packages/huggingface_hub/utils/_deprecation.py", line 101, in inner_f
    return f(*args, **kwargs)
  File "/app/env/lib/python3.10/site-packages/trl/trainer/sft_trainer.py", line 362, in __init__
    train_dataset = self._prepare_dataset(
  File "/app/env/lib/python3.10/site-packages/trl/trainer/sft_trainer.py", line 519, in _prepare_dataset
    return self._prepare_packed_dataloader(
  File "/app/env/lib/python3.10/site-packages/trl/trainer/sft_trainer.py", line 627, in _prepare_packed_dataloader
    raise ValueError(
ValueError: Error occurred while packing the dataset. Make sure that your dataset has enough samples to at least yield one packed sequence.

ERROR | 2024-06-27 09:25:27 | autotrain.trainers.common:wrapper:121 - Error occurred while packing the dataset. Make sure that your dataset has enough samples to at least yield one packed sequence.
INFO | 2024-06-27 09:25:29 | autotrain.app.utils:get_running_jobs:26 - Killing PID: 69
INFO | 2024-06-27 09:25:29 | autotrain.app.utils:kill_process_by_pid:52 - Sent SIGTERM to process with PID 69
```

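Reading the traceback bottom-up, the root cause looks like the `KeyError: 'text'`: my train split only has `question` and `answer` columns, but the job is launched with `text_column: 'text'`, so TRL’s packing code (`formatting_func = lambda x: x[dataset_text_field]`) never finds a `text` field, zero packed sequences are generated, and the run dies with the packing `ValueError`. Would preprocessing the dataset into a single `text` column fix this? A rough sketch of what I mean (the prompt format and the new repo name here are placeholders I made up, not anything AutoTrain prescribes):

```python
from datasets import load_dataset

# Load the original question/answer dataset from the Hub.
ds = load_dataset("huggingfacepremium/train", split="train")

def to_text(example):
    # Merge the two columns into the single 'text' column the SFT
    # trainer is configured to read; this exact prompt format is just
    # a guess, not a format AutoTrain requires.
    return {
        "text": f"### Question:\n{example['question']}\n\n"
                f"### Answer:\n{example['answer']}"
    }

ds = ds.map(to_text, remove_columns=["question", "answer"])
ds.push_to_hub("huggingfacepremium/train-sft")  # hypothetical repo name
```

Or is there a column-mapping option in the AutoTrain UI that I should be using instead of merging the columns myself?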