I am trying to use torch.autograd.functional.jacobian to compute the Jacobian of a set of losses with respect to the model parameters, but it returns all zeros. My function definitions are as follows:
def load_weights(model, names, orig_params, new_params, as_params=False):
    # new_params is a flat 1-D tensor holding all parameter values back to back
    param_shapes = [p.shape for p in model.dnn.parameters()]
    start = 0
    for name, p, new_p, shape in zip(names, orig_params, new_params, param_shapes):
        numel = int(torch.prod(torch.tensor(shape)))
        # carve out the slice belonging to this parameter and write it back into the model
        set_attr(model.dnn, name.split("."), torch.nn.Parameter(new_params[start:start + numel].view(shape)))
        start += numel
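set_attr is not shown above; it is essentially the standard recursive attribute setter. A sketch of roughly what mine looks like (so the snippet is self-contained):

def set_attr(obj, names, val):
    # names is the dotted parameter path split into a list, e.g. ["layer1", "weight"]
    if len(names) == 1:
        delattr(obj, names[0])       # remove the existing Parameter first
        setattr(obj, names[0], val)  # then attach the new value
    else:
        set_attr(getattr(obj, names[0]), names[1:], val)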
def func(param_list):
    load_weights(self.model, names, org_param, param_list, as_params=True)
    result = self.eq_cons(param_list)
    result.requires_grad_()
    return result

jac_mtx = torch.autograd.functional.jacobian(func, param_list, strict=True)
The eq_cons function updates the model using the passed parameters, computes a series of losses, and concatenates them into a tensor whose shape matches the corresponding data. When I modify param_list, I can see that the return values of both eq_cons and func change.
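Roughly, eq_cons has the following structure (a simplified sketch; x_data and y_data here are placeholders for my actual inputs and targets):

def eq_cons(self, param_list):
    # forward pass through the network whose weights were just replaced by load_weights
    preds = self.model.dnn(self.x_data)
    # one loss / residual term per data point, concatenated into a 1-D tensor
    return (preds - self.y_data).flatten()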
However, when I use torch.autograd.functional.jacobian to compute the Jacobian of func with respect to param_list, the resulting matrix is all zeros. When I set strict=True, it raises the following error instead:
RuntimeError: Output 0 of the user-provided function is independent of input 0. This is not allowed in strict mode.
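The jacobian API itself seems to behave as I expect for a simple standalone function; for example, a toy check like this (unrelated to my model) returns a non-zero Jacobian:

import torch

def toy(x):
    # output depends on x through differentiable ops only
    return (x ** 2).sum().unsqueeze(0)

x = torch.randn(3)
print(torch.autograd.functional.jacobian(toy, x))  # equals 2 * x, not zeros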
To debug this, I checked model.parameters() at the end of load_weights and confirmed that the parameters are indeed updated. I also tried splitting the output and calling backward() in a loop, computing the gradients term by term with the code below, but it returned the same gradient for every term of the output.
for index in range(len(output)):
    self.model.dnn.zero_grad()
    item = output[index]
    item.backward(retain_graph=True)  # keep the graph so the next item can also call backward
    for p in self.model.dnn.parameters():
        param_grad = p.grad.detach()
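I also thought about verifying whether the output of func is still attached to the autograd graph at all, along these lines:

out = func(param_list)
# if grad_fn is None, the output was built from tensors that are detached
# from param_list, which would explain the all-zero Jacobian
print(out.requires_grad, out.grad_fn)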
I am confused by this error because the output of func clearly changes when the input changes, yet I have no idea how to fix it. Why does this error occur, and how can I modify my code to compute the Jacobian correctly?