Gradient accumulation: should I duplicate data?

Yes, it would be helpful to have an update from you, @patrickvonplaten.