The dimension and size of autograd.grad

Hi everyone, I am facing a problem regarding the shape and dimension of the tensor returned by `autograd.grad`. I wonder if it may be a Python-related issue, but I failed to find any satisfactory answer, so kindly help me.

```python
import torch
from torch.autograd import grad
import torch.nn as nn

# Create some dummy data.
x = torch.ones(2, 2, requires_grad=True)
gt = torch.ones_like(x) * 16 - 0.5  # "ground-truths"

# We will use MSELoss as an example.
loss_fn = nn.MSELoss()

# Do some computations.
v = x + 2
y = v ** 2

# Compute loss.
loss = loss_fn(y, gt)
print(f'Loss: {loss}')

# Now compute gradients:
d_loss_dx = grad(outputs=loss, inputs=x)
print(len(d_loss_dx))
print(f'dloss/dx:\n {d_loss_dx}')
```

When I run the above code, the output is:

```
Loss: 42.25
1
dloss/dx:
 (tensor([[-19.5000, -19.5000],
         [-19.5000, -19.5000]]),)
```
But in the same code, when I change that line to:

```python
d_loss_dx = grad(outputs=loss, inputs=x)[0]
```

the output is:

```
Loss: 42.25
2
dloss/dx:
 tensor([[-19.5000, -19.5000],
        [-19.5000, -19.5000]])
```

I didn't understand the difference, please help. Thanks!!!

```python
d_loss_dx = grad(outputs=loss, inputs=x)[0]
```

The difference is here: when I remove `[0]` it changes the tensor shape, but the values are the same. I put the results in the question above.

`torch.autograd.grad` returns your gradients in a tuple, which for a single gradient tensor will look like what you have above:

```
(tensor([[-19.5000, -19.5000],
         [-19.5000, -19.5000]]),)
```

(A tensor wrapped inside parentheses with a comma at the end, i.e., a one-element tuple.)
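
To see why it's a tuple: `grad` returns one gradient per input, in the same order as the inputs. Here's a minimal sketch of that (the tensors `a`, `b`, and `out` are made up for illustration):

```python
import torch
from torch.autograd import grad

# Hypothetical example with two independent inputs.
a = torch.ones(2, 2, requires_grad=True)
b = torch.ones(2, 2, requires_grad=True)
out = (a * b).sum()

grads = grad(outputs=out, inputs=(a, b))
print(len(grads))      # 2 -- one gradient tensor per input, in input order
print(grads[0].shape)  # torch.Size([2, 2])
```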

However, when you add `[0]` at the end of `d_loss_dx = grad(outputs=loss, inputs=x)[0]`, you're indexing that tuple to take its 0th element, which in your case is the tensor inside it, i.e.,

```
tensor([[-19.5000, -19.5000],
        [-19.5000, -19.5000]])
```
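
That also explains your two `len()` prints: `len()` of the returned tuple is 1 (one input, one gradient), while `len()` of the 2x2 tensor you get after `[0]` is its first dimension, 2. A quick self-contained check (`retain_graph=True` is only needed here because `grad` is called twice on the same graph):

```python
import torch
from torch.autograd import grad
import torch.nn as nn

x = torch.ones(2, 2, requires_grad=True)
gt = torch.ones_like(x) * 16 - 0.5
loss = nn.MSELoss()((x + 2) ** 2, gt)

as_tuple = grad(outputs=loss, inputs=x, retain_graph=True)  # keep graph for second call
print(type(as_tuple), len(as_tuple))    # <class 'tuple'> 1

as_tensor = grad(outputs=loss, inputs=x)[0]
print(type(as_tensor), len(as_tensor))  # <class 'torch.Tensor'> 2 (first dimension)
```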

I hope that makes sense?

You can get the same behaviour without adding `[0]` at the end by placing a comma after your `d_loss_dx` variable, i.e.,

```python
d_loss_dx, = grad(outputs=loss, inputs=x)
```

which will unpack the one-element tuple, leaving `d_loss_dx` bound directly to

```
tensor([[-19.5000, -19.5000],
        [-19.5000, -19.5000]])
```
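
This unpacking style generalizes to several inputs, and it fails loudly if the number of names doesn't match the number of returned gradients, whereas `[0]` would silently take the first one. A sketch with made-up tensors `a` and `b`:

```python
import torch
from torch.autograd import grad

a = torch.ones(3, requires_grad=True)
b = torch.ones(3, requires_grad=True)
out = (a * b).sum()

# One target name per input; a mismatch raises a ValueError
# instead of silently picking the wrong element.
da, db = grad(outputs=out, inputs=(a, b))
print(da)  # tensor([1., 1., 1.])  (d(out)/da = b)
```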