with accelerator.accumulate(model):
    loss = loss.mean()
    # change above from here
    accelerator.backward(loss)
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
    global_step += 1
    log_steps += 1
# change above from here
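For intuition, here is a minimal sketch of what gradient accumulation does, using a hypothetical toy scalar model in plain Python (no Accelerate or PyTorch): gradients from several micro-batches are accumulated, then a single optimizer step is applied.

```python
def grad(w, x, y):
    # d/dw of the squared error (w * x - y) ** 2
    return 2 * x * (w * x - y)

def train_step(w, micro_batches, lr=0.1):
    """Accumulate averaged gradients over all micro-batches, then
    apply ONE update -- the analogue of calling optimizer.step()
    and optimizer.zero_grad() once per accumulation window."""
    accum_steps = len(micro_batches)
    g = 0.0
    for x, y in micro_batches:
        # dividing by accum_steps mirrors averaging the loss
        g += grad(w, x, y) / accum_steps
    return w - lr * g  # single optimizer step

w = 0.0
w = train_step(w, [(1.0, 2.0), (1.0, 2.0)])
# each micro-batch gradient is -4.0, average is -4.0, so w becomes 0.4
```

This is only a sketch of the arithmetic; inside `with accelerator.accumulate(model):`, Accelerate handles the same bookkeeping for you by skipping the actual `optimizer.step()` until the configured number of steps has been reached.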
Is this correct? Thank you in advance.