I think the problem might be where you call optimizer.zero_grad().
You call it after outputs are calculated, and if that also puts it after loss.backward(), it zeros out the gradients before optimizer.step() can use them. Try putting that line before the line where outputs are calculated.
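For reference, here is a minimal sketch of the ordering I mean. I haven't seen your full code, so the model, loss, data, and learning rate below are made-up placeholders; only the order of the calls inside the loop matters.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model and data, for illustration only.
model = nn.Linear(4, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
inputs = torch.randn(8, 4)
targets = torch.randn(8, 1)

# Keep a copy of the starting weights so we can check that training moved them.
initial_weight = model.weight.detach().clone()

for epoch in range(5):
    optimizer.zero_grad()               # clear gradients left over from the previous iteration
    outputs = model(inputs)             # forward pass
    loss = criterion(outputs, targets)  # compute the loss
    loss.backward()                     # backward pass fills in .grad on each parameter
    optimizer.step()                    # update parameters using those gradients

weights_changed = not torch.equal(initial_weight, model.weight.detach())
```

What really matters is that nothing clears the gradients between loss.backward() and optimizer.step(). Calling optimizer.zero_grad() at the top of the loop just makes that hard to get wrong.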