A question on detach() in DQN loss


In Deep Q-Networks we compute the loss from the TD error, e.g. in its squared form:

loss = ( Q(s,a) - (r + gamma * max_a' Q(s',a')) )^2

So in order to calculate the values of Q(s,a) and maxQ’(s’,a’), we need to do two forward passes through the model. If I understood correctly, each forward pass would create its own computation graph.

So my question is whether I “have to” detach() the resulting value of maxQ’(s’,a’) before doing the backward pass. Does it lead to errors if I don’t, and why?

You can write your code in this way to avoid the unwanted grad calculation:

q_sa = your_model(s).gather(1, a)          # Q(s, a) for the actions taken
with torch.no_grad():
    max_q_next = your_model(s_next).max(dim=1).values   # max_a' Q(s', a')

Then the forward pass that computes maxQ builds no graph and will not accumulate grads on the parameters of your model.
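For completeness, here is a minimal runnable sketch of how those two tensors typically combine into one training step. The linear "Q-network", the batch shapes, and the Huber loss choice are all illustrative assumptions, not something fixed by the question:

```python
import torch
import torch.nn.functional as F

model = torch.nn.Linear(4, 2)              # stand-in Q-network: 4-dim state, 2 actions
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
gamma = 0.99

# Dummy batch, just to make the sketch self-contained.
s = torch.randn(32, 4)                     # states
a = torch.randint(0, 2, (32, 1))           # actions taken
r = torch.randn(32)                        # rewards
s_next = torch.randn(32, 4)                # next states
done = torch.zeros(32)                     # episode-termination flags

q_sa = model(s).gather(1, a).squeeze(1)    # Q(s, a); gradient flows through this side
with torch.no_grad():                      # target side: no graph is built
    max_q_next = model(s_next).max(dim=1).values
target = r + gamma * (1 - done) * max_q_next

loss = F.smooth_l1_loss(q_sa, target)      # gradient flows only through q_sa
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Because `target` is created inside `torch.no_grad()`, it has `requires_grad=False` and `backward()` treats it as a constant.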

Thanks Liu.

However, my question is mostly about whether I have to do that, whether it would produce errors if I don’t, and why (in a situation where we have x and y, both resulting from forward passes and both carrying gradients, and we now want to backprop loss(x, y)).

Any help/pointers/references on this topic are still appreciated.

You are welcome. Yes, you have to do one of the two, either ‘with torch.no_grad()’ or ‘.detach()’. Omitting it does not raise an error: backward() still runs, but the gradient then also flows through the target term maxQ’(s’,a’), so the parameter update no longer matches the intended DQN update, in which the target is treated as a constant.
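To make the "no error, but wrong gradient" point concrete, here is a small sketch (the linear model and random batch are made up for illustration): backward() succeeds either way, yet the resulting parameter gradients differ, because without detaching the loss also backpropagates through the target term.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(4, 2)           # toy Q-network: 4-dim state, 2 actions
s = torch.randn(8, 4)                   # batch of states
s_next = torch.randn(8, 4)              # batch of next states
a = torch.randint(0, 2, (8, 1))         # actions taken
r = torch.randn(8)                      # rewards
gamma = 0.99

def td_grad(detach_target):
    """Run one squared-TD-error backward pass; return the weight gradient."""
    model.zero_grad()
    q_sa = model(s).gather(1, a).squeeze(1)
    target_q = model(s_next).max(dim=1).values
    if detach_target:
        target_q = target_q.detach()    # treat the target as a constant
    loss = ((q_sa - (r + gamma * target_q)) ** 2).mean()
    loss.backward()                     # no error in either case
    return model.weight.grad.clone()

grad_detached = td_grad(detach_target=True)
grad_attached = td_grad(detach_target=False)
print(torch.allclose(grad_detached, grad_attached))  # prints False: the updates differ
```

Both calls run without errors; only the gradients disagree, which is exactly the silent-bug scenario the question asks about.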