Training an RNN: backward at each step or once after the whole sequence?

Hi everyone,
I have a question about training an RNN. Should we call backward() and optimizer.step() at each recurrent step, or call backward() and optimizer.step() once after the whole sequence? Or call backward() multiple times but optimizer.step() only once?

All of them seem to work, but which makes the most sense?


Usually the second. You run the forward pass over the whole sequence, then call backward() once on the total loss and optimizer.step() once to update the parameters.
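
To make that concrete, here is a minimal sketch of the "whole sequence, then one backward() and one optimizer.step()" pattern. The toy sizes, the random data, the per-step MSE loss, and the names `rnn_cell`, `readout`, and `train_step` are all assumptions made up for illustration, not something from the question.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy setup: sizes and random data are made up for illustration.
seq_len, batch, n_in, n_hidden = 5, 3, 4, 8
rnn_cell = nn.RNNCell(n_in, n_hidden)
readout = nn.Linear(n_hidden, 1)
params = list(rnn_cell.parameters()) + list(readout.parameters())
optimizer = torch.optim.SGD(params, lr=0.01)
loss_fn = nn.MSELoss()

inputs = torch.randn(seq_len, batch, n_in)
targets = torch.randn(seq_len, batch, 1)


def train_step(inputs, targets):
    """Run the full sequence, then do one backward() and one step()."""
    optimizer.zero_grad()                 # clear gradients from the last update
    h = torch.zeros(batch, n_hidden)      # initial hidden state
    total_loss = 0.0
    for t in range(seq_len):
        h = rnn_cell(inputs[t], h)        # one recurrent step
        total_loss = total_loss + loss_fn(readout(h), targets[t])
    total_loss.backward()                 # backprop through all time steps at once
    optimizer.step()                      # single parameter update
    return total_loss.item()


losses = [train_step(inputs, targets) for _ in range(20)]
print(losses[0], losses[-1])
```

Calling backward() on each per-step loss and then optimizer.step() once should accumulate the same gradients, since gradients add up in .grad, but because the steps share one graph through the hidden state it generally needs retain_graph=True and repeats work. Summing the losses and calling backward() once is simpler.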
