Say I have an RNN-like model which has a loss at every step:
```python
for step in range(step_cnt):
    output, hidden = rnn(input, hidden)
    # accumulate the per-step loss into one total
    loss = loss + criterion(output, target)
```
When I call `loss.backward()` on the accumulated loss, will gradients accumulate at every step of the RNN? And won't this accumulation lead to exploding gradients?
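To make the question concrete, here is a minimal runnable sketch of what I want to check (it uses `nn.RNNCell` with made-up sizes, not my actual model): whether a single backward on the summed loss leaves the same accumulated gradient in `.grad` as calling backward once per step.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
rnn = nn.RNNCell(input_size=4, hidden_size=8)   # stand-in for the real model
head = nn.Linear(8, 3)
criterion = nn.CrossEntropyLoss()

inputs = torch.randn(5, 1, 4)                   # 5 steps, batch size 1
targets = torch.randint(0, 3, (5, 1))           # one class label per step

def accumulated_grad(sum_then_backward):
    rnn.zero_grad()
    head.zero_grad()
    hidden = torch.zeros(1, 8)
    total = 0.0
    for step in range(5):
        hidden = rnn(inputs[step], hidden)
        step_loss = criterion(head(hidden), targets[step])
        if sum_then_backward:
            total = total + step_loss
        else:
            # backward per step: gradients accumulate into .grad
            step_loss.backward(retain_graph=True)
    if sum_then_backward:
        total.backward()
    return rnn.weight_ih.grad.clone()

# both variants should leave the same gradient in .grad
print(torch.allclose(accumulated_grad(True), accumulated_grad(False)))
```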
By the way, if I average the loss with `loss = loss / step_cnt`, will the gradients be different from the sum version?
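And this is the sum-vs-mean comparison I have in mind (again with made-up sizes and a simple MSE loss just for illustration); my expectation is that dividing by `step_cnt` only rescales every gradient by `1/step_cnt` without changing its direction, but I would like to confirm that:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
step_cnt = 5
rnn = nn.RNNCell(4, 8)
inputs = torch.randn(step_cnt, 1, 4)
target = torch.zeros(1, 8)

def grad_of(average):
    rnn.zero_grad()
    hidden = torch.zeros(1, 8)
    loss = 0.0
    for step in range(step_cnt):
        hidden = rnn(inputs[step], hidden)
        loss = loss + F.mse_loss(hidden, target)
    if average:
        loss = loss / step_cnt          # the averaged variant
    loss.backward()
    return rnn.weight_ih.grad.clone()

g_sum, g_mean = grad_of(False), grad_of(True)
print(torch.allclose(g_sum / step_cnt, g_mean))   # only the scale should differ
```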