I’m trying to create a version of A3C reinforcement learning in Caffe and have a question about how ‘loss.backward()’ works in PyTorch.
In Caffe (which I am familiar with), the loss value is computed and stored in a scalar such as a ‘double’ or ‘float’. The gradient diffs are then computed inside each layer’s backward function, and the solver later applies those diffs to the various learnable blobs based on the learning rate, weight decay, etc.
How does this work in PyTorch? For example, I see in numerous A3C examples how the loss is calculated, but what actually happens when ‘loss.backward()’ is called?
Does ‘loss.backward()’ apply the loss ‘value’ to each layer of the model in much the same way Caffe does, or is there some extra bit of magic I’m missing?
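For concreteness, here is a minimal sketch of the kind of loop I mean (the model and data here are just placeholders, not my actual A3C code):

```python
import torch

# Toy stand-in for a network: a single linear layer
model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(8, 4)
target = torch.randn(8, 1)

optimizer.zero_grad()  # clear any previously accumulated gradients
loss = torch.nn.functional.mse_loss(model(x), target)

loss.backward()        # <-- the call in question: what does this do internally?

# After backward(), every parameter has a populated .grad tensor,
# which looks analogous to Caffe's per-blob diffs
for p in model.parameters():
    print(p.grad.shape)

optimizer.step()       # apply the gradients using lr, momentum, etc.
```

My rough guess is that ‘loss.backward()’ fills the ‘.grad’ fields (like Caffe computing diffs) and ‘optimizer.step()’ plays the role of the solver update, but I’d like to confirm that.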
Any comments are appreciated.
Thanks!