Setting max_steps with IterableDataset still errors

Hi, I can’t figure out how to train using an IterableDataset. I keep running into errors at the train step. If I don’t set the max_steps training arg, I get an error claiming that the dataset doesn’t have a length. On the other hand, with max_steps=150_000 set, I see the error below.

Any pointers?

There seems to be not a single sample in your epoch_iterator, stopping training at step 0! This is expected if you're using an IterableDataset and set num_steps (1500000) higher than the number of available samples.
{'eval_runtime': 3.9334, 'eval_samples_per_second': 0.0, 'eval_steps_per_second': 0.0, 'epoch': 0}
  0%|          | 0/1500000 [00:07<?, ?it/s]
Traceback (most recent call last):
    train_loss = self._total_loss_scalar / self.state.global_step
ZeroDivisionError: float division by zero
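
The warning in the log says the epoch iterator produced no samples, so global_step is still 0 when the average loss is computed, hence the ZeroDivisionError. One way to narrow this down is to check that the dataset yields at least one item before handing it to the trainer. This is only a minimal sketch, assuming the dataset behaves like a plain Python iterable; `peek_first`, `empty_stream`, and `small_stream` are hypothetical names for illustration and are not part of transformers or datasets.

```python
from itertools import chain

_EMPTY = object()  # sentinel marking "iterator produced nothing"


def peek_first(iterable):
    """Return (first_item, stream_with_all_items), or (None, None) if empty."""
    it = iter(iterable)
    first = next(it, _EMPTY)
    if first is _EMPTY:
        return None, None
    # Put the consumed item back in front so no sample is lost.
    return first, chain([first], it)


def empty_stream():
    # Stands in for a streaming dataset whose loader found no records.
    return
    yield


def small_stream():
    # Stands in for a streaming dataset with three samples.
    for i in range(3):
        yield {"text": f"sample {i}"}


first, stream = peek_first(empty_stream())
print(stream is None)  # True: the empty stream is caught before training

first, stream = peek_first(small_stream())
print(first, len(list(stream)))  # {'text': 'sample 0'} 3
```

If the check reports an empty stream, the problem is in how the data is loaded or filtered rather than in the max_steps value.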

Hi, I am facing the same problem.

Have you fixed this error yet? If so, could you help me out, please?

I am also facing the exact same error, and I can’t find many resources about it online. Could you please share what you did to solve it? I would really appreciate the help!

I encountered the same issue. Previously, I set per_device_train_batch_size=1 in TrainingArguments, but I found that if I comment out this line of code, the problem is resolved, and the trainer runs normally. Does anyone have any idea why this is happening?

I also encountered this error. In my case, the cause was that I should have loaded the data in jsonl format but loaded it in json format instead, which resulted in no data being loaded, and then this error occurred.
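
For anyone unsure about the difference mentioned above: a JSON Lines file holds one JSON value per line, while a regular JSON file is a single document. The sketch below uses only the standard library to show how the same records look in each format and what happens when one is read as the other. `read_jsonl_records` is a hypothetical helper for illustration; it is not how the datasets library loads files.

```python
import json


def read_jsonl_records(text):
    """Parse JSON Lines text: one JSON value per non-blank line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# The same two records, once as JSON Lines and once as a single JSON array.
jsonl_text = '{"text": "a"}\n{"text": "b"}\n'
json_text = '[{"text": "a"}, {"text": "b"}]'

records = read_jsonl_records(jsonl_text)
print(len(records))  # 2

# Reading the JSON array line by line yields one list, not two records.
as_lines = read_jsonl_records(json_text)
print(len(as_lines), type(as_lines[0]).__name__)  # 1 list
```

Printing the record count like this before training is a quick way to confirm that the file was parsed into the samples you expect.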