When using DeepSpeed, why do I need to pass dataloaders to `accelerator.prepare`?

Hello @aps, yes, you are correct. The logic behind the current setup is that conventional training involves preparing dataloaders, and we fill the relevant DeepSpeed config params (such as `train_micro_batch_size_per_gpu`) from them. For the use case you have described, the current workaround is to pass a dummy dataloader with the `batch_size` filled in, which mimics passing the `batch_size` arg directly to the `prepare` call.
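For illustration, here is a minimal sketch of the dummy-dataloader workaround. The model, optimizer, and batch size of 16 are placeholders rather than anything from your setup; the only thing the dummy dataloader is there for is its `batch_size`.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()  # launched with a DeepSpeed config via `accelerate launch`

# Placeholder model and optimizer for the sketch.
model = torch.nn.Linear(10, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Dummy dataloader whose only purpose is to carry the batch size that
# Accelerate reads to fill DeepSpeed's train_micro_batch_size_per_gpu.
dummy_dataset = TensorDataset(torch.zeros(1, 10))
dummy_dataloader = DataLoader(dummy_dataset, batch_size=16)

model, optimizer, _ = accelerator.prepare(model, optimizer, dummy_dataloader)
# ...continue with your own data pipeline using the prepared model/optimizer...
```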

A cleaner approach would be to skip this step when `train_micro_batch_size_per_gpu` is already provided in the config file via the `DEEPSPEED_CONFIG_FILE` support, e.g. a config containing the entry shown below. Let me know if that would solve the issue; if so, please raise a feature request on the repo.
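For reference, providing that value in the DeepSpeed config file would look something like this fragment (the value 16 is purely illustrative):

```json
{
  "train_micro_batch_size_per_gpu": 16
}
```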
