I am trying to retrieve the gradients of the output variables with respect to the input variables of a neural network model (loaded from a .pt file). Note that the network describes a dynamical system in which a history of states is taken into account. The network is too complicated to have any intuition about the range in which the gradients should lie.
I am currently approaching it as follows:
```python
x = torch.zeros(1, 1500)                  # 1500 inputs to model
x[:] = torch.from_numpy(array_of_inputs)  # Give all input and state data we have to the NN
x.requires_grad = True                    # Make sure gradients can be extracted

# Compute outputs
out = model(x)

# Get gradient of variable alphadot (index in output array is 1)
alphadot = out[0, 1]
gradient = torch.autograd.grad(outputs=alphadot, inputs=x,
                               grad_outputs=torch.ones_like(alphadot),
                               retain_graph=True)[0]  # grad() returns a tuple; take the first element
gradient = gradient.tolist()
```
My intent is that the variable gradient contains the gradients of the output with index 1 (alphadot) with respect to all 1500 input variables in x. Does the code snippet above behave the way I describe? If not, what am I doing wrong?
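For reference, here is a minimal, self-contained sketch of the same pattern on a stand-in model (a bias-free linear layer, not my actual network) whose Jacobian is known exactly, so the autograd result can be checked by hand:

```python
import torch

torch.manual_seed(0)
# Hypothetical stand-in for the model loaded from the .pt file:
# a linear map out = x @ W.T, so d(out[i])/dx is simply row i of W.
model = torch.nn.Linear(1500, 4, bias=False)

x = torch.rand(1, 1500, requires_grad=True)
out = model(x)

# Scalar output with index 1 ("alphadot" in my setup).
alphadot = out[0, 1]

# grad() returns a tuple of gradients, one per input tensor.
(gradient,) = torch.autograd.grad(outputs=alphadot, inputs=x)

# For this linear model, the gradient must equal row 1 of the weights.
print(gradient.shape)                                          # torch.Size([1, 1500])
print(torch.allclose(gradient, model.weight[1].unsqueeze(0)))  # True
```

Since alphadot is a scalar here, grad_outputs is not needed; passing grad_outputs=torch.ones_like(alphadot) would give the same result.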
Thanks a lot for your help!