Get the gradient of the network parameters

I am stuck on getting the gradients of a network’s parameters. What I want to do is create a single long one-dimensional tensor that stores all the gradients of the network’s parameters (weights, biases) without updating the parameters via Optimizer_Network.step(). I need this one-dimensional tensor because I have to perform some other operations on it later. How can I get it?


You could iterate over all parameters and store each gradient in a list:

import torch
import torchvision.models as models

model = models.resnet50()
# Calculate dummy gradients
model(torch.randn(1, 3, 224, 224)).mean().backward()

# Flatten each parameter's gradient and concatenate everything into one 1D tensor
grads = []
for param in model.parameters():
    grads.append(param.grad.view(-1))
grads = torch.cat(grads)
print(grads.shape)
> torch.Size([25557032])

Thanks @ptrblck, it works!

Hi, @ptrblck’s solution works for me as well, but is there a more efficient way to do this? Possibly without a for loop, especially for networks with a large number of parameters.
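
A more concise option, though it still iterates over the tensors internally (so it is likely a readability gain rather than a speedup), would be torch.nn.utils.parameters_to_vector applied to the gradients; a minimal sketch, assuming the gradients have already been populated by backward():

import torch
import torchvision.models as models
from torch.nn.utils import parameters_to_vector

model = models.resnet50()
model(torch.randn(1, 3, 224, 224)).mean().backward()

# parameters_to_vector flattens and concatenates an iterable of tensors,
# so it can be applied to the gradients directly
flat_grads = parameters_to_vector(p.grad for p in model.parameters())
print(flat_grads.shape)
> torch.Size([25557032])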


Yes, I have the same question. Is there any way to get the gradients of the parameters directly from the optimizer object without accessing the original parameters through the model object?
Thanks!

Yes, you could use the references to the parameters stored in the optimizer and check their gradients through them:

for p in optimizer.param_groups[0]['params']:
    print(p.grad)

but I don’t know what kind of advantage it would give you.

Thanks a lot. I think the advantage I was looking for is that I can directly access only the parameters that are to be optimized when they are a subset of the entire set of model parameters.

Ah OK, that makes sense. I was unsure about your use case, as I thought you might be expecting faster access or something similar.
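
A minimal sketch of that subset use case, assuming a toy model where only the last layer’s parameters are passed to the optimizer (the model and shapes here are purely illustrative):

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 2))
# Optimizer over a subset of the model's parameters (here: the last layer only)
optimizer = torch.optim.SGD(model[2].parameters(), lr=0.1)

model(torch.randn(4, 10)).mean().backward()

# Flatten only the gradients of the parameters the optimizer references
subset_grads = torch.cat([p.grad.view(-1) for p in optimizer.param_groups[0]['params']])
print(subset_grads.shape)
> torch.Size([42])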
