You should use a global step counter that runs across the whole training loop instead of the per-epoch step counter. That way, the batch left unfinished at the end of epoch 0 gets completed during epoch 1. Unless your dataset is pretty small, the probability of that batch containing the same sample twice is not very high.
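Here is a minimal sketch of the idea, assuming the question is about gradient accumulation (several micro-batches per optimizer update). The names `num_epochs`, `micro_batches_per_epoch`, `accumulation_steps`, `global_step` and `updates` are all made up for illustration, and the actual backward/optimizer calls are only indicated in comments, so this runs in plain Python:

```python
# Hypothetical sizes, chosen so that an epoch does not divide evenly
# into accumulated batches (5 micro-batches, 4 per update).
num_epochs = 2
micro_batches_per_epoch = 5
accumulation_steps = 4

global_step = 0  # counts micro-batches over the whole run, never reset
updates = []     # (epoch, step) pairs at which an optimizer update would fire

for epoch in range(num_epochs):
    # `step` is the per-epoch counter: it restarts at 0 every epoch.
    for step in range(micro_batches_per_epoch):
        # In a real loop, the forward pass and loss.backward() would go here.
        global_step += 1
        if global_step % accumulation_steps == 0:
            # In a real loop: optimizer.step() then optimizer.zero_grad().
            updates.append((epoch, step))

print(updates)  # [(0, 3), (1, 2)]
```

The second update fires at step 2 of epoch 1, so it combines the last micro-batch of epoch 0 with the first three of epoch 1. If you tested `step` instead of `global_step`, the leftover micro-batch from epoch 0 would not line up with an accumulation boundary like this. Note that the last two micro-batches of epoch 1 are still pending when the loop ends, so you may want to handle that remainder explicitly.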