Get the gradient of the network parameters

I am stuck on getting the gradients of a network’s parameters. What I want to do is create a single long one-dimensional tensor that stores all the gradients of the network’s parameters (weights, biases) without updating the parameters via Optimizer_Network.step(). I need this one-dimensional tensor because I have to perform some other operations on it later. How can I get it?


You could iterate over all parameters and store each gradient in a list:

import torch
import torchvision.models as models

model = models.resnet50()
# Calculate dummy gradients
model(torch.randn(1, 3, 224, 224)).mean().backward()

# Flatten each parameter's gradient and concatenate everything into one 1D tensor
grads = []
for param in model.parameters():
    grads.append(param.grad.view(-1))
grads = torch.cat(grads)
print(grads.shape)
> torch.Size([25557032])

Thanks @ptrblck, it works!

Hi, @ptrblck’s solution works for me as well, but is there a more efficient way to do this? Possibly without a for loop, especially for networks with a large number of parameters.
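
A more concise option, though it still iterates over the tensors internally (so it is likely a readability gain rather than a speedup), would be torch.nn.utils.parameters_to_vector applied to the gradients; a minimal sketch, assuming the gradients have already been populated by backward():

import torch
import torchvision.models as models
from torch.nn.utils import parameters_to_vector

model = models.resnet50()
model(torch.randn(1, 3, 224, 224)).mean().backward()

# parameters_to_vector flattens and concatenates an iterable of tensors,
# so it can be applied to the gradients directly
flat_grads = parameters_to_vector(p.grad for p in model.parameters())
print(flat_grads.shape)
> torch.Size([25557032])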


Yes, I have the same question. Is there any way to get the gradients of the parameters directly from the optimizer object without accessing the original parameters through the model object?
Thanks!

Yes, you could use the references to the parameters stored in the optimizer and check their gradients through them:

for p in optimizer.param_groups[0]['params']:
    print(p.grad)

but I don’t know what kind of advantage it would give you.

Thanks a lot. I think the advantage I was looking for is that I can directly access only the parameters that are to be optimized when they are a subset of the entire set of model parameters.

Ah OK, that makes sense. I was unsure about your use case, as I thought you might be expecting faster access or something similar.
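
A minimal sketch of that subset use case, assuming a toy model where only the last layer’s parameters are passed to the optimizer (the model and shapes here are purely illustrative):

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 2))
# Optimizer over a subset of the model's parameters (here: the last layer only)
optimizer = torch.optim.SGD(model[2].parameters(), lr=0.1)

model(torch.randn(4, 10)).mean().backward()

# Flatten only the gradients of the parameters the optimizer references
subset_grads = torch.cat([p.grad.view(-1) for p in optimizer.param_groups[0]['params']])
print(subset_grads.shape)
> torch.Size([42])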
