Training an RNN: call backward() at each step, or once after the whole sequence?

ywu36 · September 26, 2017, 6:45pm

Hi everyone,
I have a question about training an RNN. Should we call backward() and optimizer.step() at every recurrent step, or call backward() and optimizer.step() just once after the whole sequence? Or call backward() multiple times but optimizer.step() only once?

All of them seem to work, but which one makes the most sense?

Thanks,
Regards,
Yuhang

ruotianluo · September 26, 2017, 9:53pm

Usually the second. You run the forward pass over the whole sequence, then call backward() once and update the parameters.
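
A minimal sketch of that second option, in case it helps. Everything here is an assumption for illustration and not from the original question: the toy nn.RNNCell model, the linear output layer `head`, the sizes, the random data, and the learning rate. The point is only the order of the calls: accumulate the loss over all time steps, then call backward() and optimizer.step() once.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy setup: sizes and data are placeholders for illustration.
seq_len, batch, in_dim, hid_dim = 5, 4, 3, 8
cell = nn.RNNCell(in_dim, hid_dim)   # one recurrent step at a time
head = nn.Linear(hid_dim, 1)         # maps the hidden state to a prediction
params = list(cell.parameters()) + list(head.parameters())
optimizer = torch.optim.SGD(params, lr=0.01)
criterion = nn.MSELoss()

inputs = torch.randn(seq_len, batch, in_dim)
targets = torch.randn(seq_len, batch, 1)


def train_step():
    """Run the whole sequence, then do one backward() and one step()."""
    optimizer.zero_grad()
    h = torch.zeros(batch, hid_dim)
    loss = 0.0
    for t in range(seq_len):
        h = cell(inputs[t], h)                        # forward one time step
        loss = loss + criterion(head(h), targets[t])  # accumulate, no backward yet
    loss.backward()    # single backward pass through all time steps
    optimizer.step()   # single parameter update
    return loss.item()


losses = [train_step() for _ in range(50)]
```

Because the per-step losses are summed before backward() is called, the gradients flow back through every time step of the sequence in a single pass.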