Is there any difference between these two methods of gradient accumulation?
- Accumulate the averaged loss, then call `backward()` once:

```python
accum_loss = 0
for _ in range(10):
    out = model(x)
    loss = get_loss(out, y)
    accum_loss += loss
optimizer.zero_grad()
accum_loss /= 10
accum_loss.backward(retain_graph=True)
optimizer.step()
```
- Accumulate with repeated calls to autograd's `backward()` function:

```python
optimizer.zero_grad()
for _ in range(10):
    out = model(x)
    loss = get_loss(out, y)
    loss.backward()
optimizer.step()
```
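To make the comparison concrete, here is a self-contained toy sketch of both loops. The `Linear` model, the data `x`/`y`, and the `get_loss` below are stand-ins for the ones above (assumptions, not the real setup), and the batch is fixed across iterations for simplicity:

```python
import torch

torch.manual_seed(0)
x = torch.randn(4, 3)
y = torch.randn(4, 1)

def get_loss(out, y):
    # stand-in loss; the real get_loss may differ
    return ((out - y) ** 2).mean()

def make_model():
    torch.manual_seed(1)  # identical initial weights for a fair comparison
    return torch.nn.Linear(3, 1)

# Method 1: sum the 10 losses, average, then a single backward call.
m1 = make_model()
accum_loss = 0
for _ in range(10):
    accum_loss = accum_loss + get_loss(m1(x), y)
(accum_loss / 10).backward()
g1 = m1.weight.grad.clone()

# Method 2: call backward on each loss; gradients accumulate in .grad.
m2 = make_model()
for _ in range(10):
    get_loss(m2(x), y).backward()
g2 = m2.weight.grad.clone()

# Method 2 never divides by 10, so its accumulated gradient is 10x larger.
print(torch.allclose(g2, 10 * g1))  # True
```

So apart from the 1/10 scaling (which the second loop would need to apply to each `loss` to match the first), the accumulated gradients agree; the first version also has to keep all ten graphs in memory until the single `backward()` call.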
Help me, please…