So in order to calculate the values of Q(s, a) and max Q’(s’, a’), we need to do a forward pass on the model two times. If I understood correctly, this would create a separate computation graph for each forward pass.
So my question is whether I “have to” detach() the resulting value of maxQ’(s’,a’) before doing the backward pass. Does it lead to errors if I don’t, and why?
However, my question is mostly about whether I have to do that, whether it would produce errors if I don’t, and why, in a situation where we have x and y, both resulting from forward passes, both requiring gradients, and we now want to backprop loss(x, y).
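To make the scenario concrete, here is a minimal sketch of what I mean (the linear layer, shapes, reward, and gamma are just placeholders, not my actual model):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# toy stand-in for the Q-network; the architecture is just a placeholder
q_net = nn.Linear(4, 2)

state = torch.randn(1, 4)       # s
next_state = torch.randn(1, 4)  # s'

x = q_net(state)[0, 0]                     # Q(s, a): grad-tracked
y = 1.0 + 0.99 * q_net(next_state).max()   # r + gamma * max Q'(s', a'): also grad-tracked

loss = F.mse_loss(x, y)
loss.backward()  # runs without error, but backprops through BOTH branches
```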
You are welcome. Yes, you have to do that, either by wrapping the target computation in ‘with torch.no_grad():’ or by calling ‘.detach()’ on the result. Skipping it does not raise a runtime error, but backward() will then also propagate gradients through the max Q’(s’, a’) branch, so your gradient is not computed correctly: the Q-learning update treats the target as a fixed value, not as something to differentiate through.
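As a minimal sketch of what I mean (the network, reward, and gamma here are placeholders):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

q_net = nn.Linear(4, 2)  # placeholder Q-network
gamma, reward, action = 0.99, 1.0, 0

state = torch.randn(1, 4)
next_state = torch.randn(1, 4)

q_sa = q_net(state)[0, action]  # Q(s, a): keep this branch grad-tracked

# cut the target branch out of the graph
with torch.no_grad():
    max_q_next = q_net(next_state).max()
# equivalently: max_q_next = q_net(next_state).max().detach()

target = reward + gamma * max_q_next
loss = F.mse_loss(q_sa, target)
loss.backward()  # gradients now flow only through the Q(s, a) branch
```

Both options give the same loss and the same (correct) gradient; the difference is that torch.no_grad() also skips recording the target forward pass in the autograd graph, which saves a bit of memory.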