While the comments in the tutorial say that autograd is used, it is never explicitly declared (as far as I can see). In supervised learning, the inputs are usually set as input_data = Variable(input_data) and then out = net.forward(data). Here, however, Variable is never used. I do see that the loss tensor contains a gradient, but I am not sure where it came from.
Another observation: if I set
state_action_values = Variable(state_action_values, requires_grad=True)
then the code will not run, throwing an error on:
for param in policy_net.parameters():
saying that 'NoneType' has no attribute data (whereas before adding the Variable code it clearly did…).
Which PyTorch version are you using?
In 0.4.0, Variables and tensors were merged, so you don't have to wrap your tensors anymore.
If you are still used to Variables, the migration guide might help.
Also, the current release is 0.4.1. Make sure to update to this version.
I haven’t explored the tutorial in detail, but from what I know state_action_values are the output of the model, and should already require gradients.
Could you check it with state_action_values.requires_grad?
Also, if you re-wrap a tensor, it will lose its associated computation graph, so you are effectively detaching it.
That's why .grad is empty in the example you posted.
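A minimal sketch (not from the tutorial itself) of the detaching behavior described above, using a plain leaf tensor as a stand-in for a model parameter:

```python
import torch

# A leaf tensor standing in for a model parameter:
w = torch.randn(3, requires_grad=True)
out = (w * 2).sum()          # out.requires_grad is True automatically
out.backward()
print(w.grad is not None)    # True: the gradient flowed back to the leaf

# Re-wrapping the result creates a NEW leaf with no history:
w.grad = None
rewrapped = torch.tensor((w * 2).sum().item(), requires_grad=True)
rewrapped.backward()
print(w.grad)                # None: the computation graph was cut
```

The re-wrapped tensor accumulates its own .grad, but nothing reaches w, which is exactly why the parameters end up with grad of None.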
Yes, state_action_values is indeed the output of the policy_net, which takes state_batch as input. I checked, and state_action_values.requires_grad is True even though it was never explicitly set in the code. (I guess this is the default when passing tensors through a model, unless inside torch.no_grad() - right?)
Yes, your example is clear. I see: because the module parameters inside the network have requires_grad set automatically, everything computed from them requires grad as well.
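That behavior can be sketched in a few lines, assuming a toy nn.Linear module in place of the DQN:

```python
import torch
import torch.nn as nn

# Module parameters require grad by default, so any output
# computed from them tracks gradients too.
net = nn.Linear(4, 2)
x = torch.randn(1, 4)        # plain input tensor, requires_grad=False
out = net(x)
print(out.requires_grad)     # True, inherited from the parameters

with torch.no_grad():        # ...unless grad tracking is disabled
    out2 = net(x)
print(out2.requires_grad)    # False
```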
I just modified the DQN to load a pre-trained network. It looks like it was trained several versions ago (and used Variable). Now I run into the same problem as before.
I guess this is because the stored model dictionary had a Variable in it, right (their module defined Variable within the network itself)? Does this mean I can't use older pre-trained networks on torch 0.4+?
OK, I performed network surgery: I redefined the network and loaded the state_dict entries only for the modules that still existed (removing the method that used a Variable). Still no dice. I added requires_grad=True to the tensors, but there are still parameters in the network without a gradient. Not sure what is going on, but it feels like something might be wrong. (Note that everything is OK when training from scratch - I can load those models back in, etc.)
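For reference, the "surgery" I described looks roughly like this. It's a hypothetical sketch with a toy Sequential model, and checkpoint here is a stand-in for the dict returned by torch.load on the old file: keep only the entries whose names and shapes match the current model, then load the merged dict.

```python
import torch
import torch.nn as nn

# Toy model standing in for the redefined DQN:
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
checkpoint = model.state_dict()          # pretend this came from torch.load(...)

# Keep only checkpoint entries that match the new model by name and shape:
model_state = model.state_dict()
filtered = {k: v for k, v in checkpoint.items()
            if k in model_state and v.shape == model_state[k].shape}
model_state.update(filtered)
model.load_state_dict(model_state)

# Loading a state_dict does not touch requires_grad on the parameters:
print(all(p.requires_grad for p in model.parameters()))  # True
```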