Yes, your example is clear. I see: because the module's Parameters
inside the network are automatically created with requires_grad=True, everything that goes through the network gets a grad.
I just modified the DQN to load a pre-trained network. It looks like it was trained several versions ago (and used Variable). Now I run into the same problem as before.
I guess this is because the stored model dictionary had a Variable in it, right (their module defined a Variable within the network itself)? Does this mean I can't use older pre-trained networks on torch 0.4+?
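For what it's worth, here is a minimal sketch of what I imagine the workaround would look like: unwrap each entry of the old state dict into a plain detached tensor before calling load_state_dict. This is just an assumption on my part (the checkpoint is simulated below by wrapping tensors in torch.autograd.Variable, which is a no-op on 0.4+ where Variable and Tensor are merged), not a confirmed fix:

```python
import io

import torch
import torch.nn as nn

net = nn.Linear(4, 2)

# Simulate an old-style checkpoint whose values are Variables.
old_ckpt = {k: torch.autograd.Variable(v) for k, v in net.state_dict().items()}

# Round-trip through torch.save/torch.load, as with a real file on disk.
buf = io.BytesIO()
torch.save(old_ckpt, buf)
buf.seek(0)
loaded = torch.load(buf, map_location="cpu")

# Unwrap every entry into a plain detached tensor before loading.
clean = {k: v.detach() for k, v in loaded.items()}

net2 = nn.Linear(4, 2)
net2.load_state_dict(clean)
```

Would something along these lines let an older checkpoint load on 0.4+, or is the Variable baked into the saved module in a way this can't undo?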