I have the same issue. The data seems to be distributed across multiple nodes, but the total training time does not decrease.