Prefetch factor issue

Hey folks,

I am training a BERT model. It was working fine until two days ago, but now I am suddenly getting this error:

Detected kernel version 4.14.336, which is below the recommended minimum of 5.5.0; this can cause the process to hang. It is recommended to upgrade the kernel to the minimum version or higher.

ValueError Traceback (most recent call last)
Cell In[22], line 23
1 training_args = TrainingArguments(
2 output_dir="output",
3 learning_rate=2e-5,
(…)
10 load_best_model_at_end=True
11 )
13 trainer = Trainer(
14 model=model,
15 args=training_args,
(…)
20 compute_metrics=compute_metrics
21 )
---> 23 trainer.train()

File /opt/conda/lib/python3.8/site-packages/transformers/trainer.py:1624, in Trainer.train(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)
1622 hf_hub_utils.enable_progress_bars()
1623 else:
-> 1624 return inner_training_loop(
1625 args=args,
1626 resume_from_checkpoint=resume_from_checkpoint,
1627 trial=trial,
1628 ignore_keys_for_eval=ignore_keys_for_eval,
1629 )

File /opt/conda/lib/python3.8/site-packages/transformers/trainer.py:1653, in Trainer._inner_training_loop(self, batch_size, args, resume_from_checkpoint, trial, ignore_keys_for_eval)
1651 logger.debug(f"Currently training with a batch size of: {self._train_batch_size}")
1652 # Data loader and number of training steps
-> 1653 train_dataloader = self.get_train_dataloader()
1654 if self.is_fsdp_xla_v2_enabled:
1655 train_dataloader = tpu_spmd_dataloader(train_dataloader)

File /opt/conda/lib/python3.8/site-packages/transformers/trainer.py:852, in Trainer.get_train_dataloader(self)
849 dataloader_params["worker_init_fn"] = seed_worker
850 dataloader_params["prefetch_factor"] = self.args.dataloader_prefetch_factor
-> 852 return self.accelerator.prepare(DataLoader(train_dataset, **dataloader_params))

File /opt/conda/lib/python3.8/site-packages/torch/utils/data/dataloader.py:183, in DataLoader.__init__(self, dataset, batch_size, shuffle, sampler, batch_sampler, num_workers, collate_fn, pin_memory, drop_last, timeout, worker_init_fn, multiprocessing_context, generator, prefetch_factor, persistent_workers)
180 raise ValueError('timeout option should be non-negative')
182 if num_workers == 0 and prefetch_factor != 2:
-> 183 raise ValueError('prefetch_factor option could only be specified in multiprocessing.'
184 'let num_workers > 0 to enable multiprocessing.')
185 assert prefetch_factor > 0
187 if persistent_workers and num_workers == 0:

ValueError: prefetch_factor option could only be specified in multiprocessing.let num_workers > 0 to enable multiprocessing.

Can someone help me with this?

I am installing the packages below:

! pip install langdetect
! pip install transformers[torch]
! pip install accelerate -U
! pip install transformers
! pip install datasets
! pip install seqeval
! pip install evaluate
! conda install -n base -c conda-forge -y ipywidgets
! pip install transformers --upgrade
%pip install 'snowflake-connector-python[pandas]'
!pip install keyring==23.10.0

Please check the dependencies in your virtual environment and restart the system; these look like dependency conflicts.

@jeevisha30 I am facing the same issue, were you able to find a solution? In my case it is happening with the BEiT (BERT Pre-Training of Image Transformers) model.

What version of torch are you using? <=1.13.1?

I am seeing a regression with transformers 4.38.* and torch <= 1.13.1.
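If you want to confirm whether you have that combination, a quick check (nothing here is specific to the Trainer, it just prints the installed versions):

import torch
import transformers

# The problematic combination appears to be transformers 4.38.* with torch <= 1.13.1.
print("torch:", torch.__version__)
print("transformers:", transformers.__version__)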

dataloader_prefetch_factor was added to TrainingArguments two months ago with a default value of None: transformers/src/transformers/training_args.py at e9476832942a19cf99354776ef112babc83c139a · huggingface/transformers · GitHub

But old versions of torch do not accept None and will raise an error if num_workers == 0 and prefetch_factor != 2: pytorch/torch/utils/data/dataloader.py at 49444c3e546bf240bed24a101e747422d1f8a0ee · pytorch/pytorch · GitHub
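That check is easy to trigger outside the Trainer too. A minimal sketch, assuming torch 1.13.x (the dataset is just a throwaway list):

import torch
from torch.utils.data import DataLoader

print(torch.__version__)  # e.g. 1.13.1

# On torch <= 1.13, prefetch_factor defaults to 2 and the constructor rejects
# any other value when num_workers == 0, so the None that transformers 4.38
# now passes through hits exactly the ValueError from the traceback above.
DataLoader(list(range(8)), num_workers=0, prefetch_factor=None)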

I am using torch version 1.13.1+cu116
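In case it helps while the versions are mismatched: upgrading torch to 2.x or pinning transformers below 4.38 should avoid this, and it also seems to go away if you let the Trainer use worker processes together with an explicit integer prefetch factor. A sketch of that last option (dataloader_num_workers and dataloader_prefetch_factor are real TrainingArguments fields in 4.38; the values 2/2 are just examples):

from transformers import TrainingArguments

# Workaround sketch for torch <= 1.13 with transformers 4.38.*:
# with num_workers > 0 and an integer prefetch_factor, the old DataLoader
# check (num_workers == 0 and prefetch_factor != 2) is never hit with None.
training_args = TrainingArguments(
    output_dir="output",
    learning_rate=2e-5,
    dataloader_num_workers=2,      # > 0 enables multiprocessing in the DataLoader
    dataloader_prefetch_factor=2,  # old torch only accepts an int here
)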
