Is the shape of a PyTorch gradient always the same as the input?

When I call backward() on a non-scalar variable $y$, the shape of the result is always the same as the input $x$.

Is there any way to get a $y$-shaped result?

e.g.

y = model(x) # x.shape: (B, 1), y.shape: (B, K)
y.backward(torch.ones_like(y)) 
x.grad.shape == x.shape # (B, 1)

>>> True

But what I want to get is
$$
\frac{\partial y}{\partial x} = \left(\frac{\partial y_1}{\partial x}, \frac{\partial y_2}{\partial x}, \dots, \frac{\partial y_K}{\partial x}\right)^T
$$
a result of shape (B, K).

My current solution is to write a for loop that computes each $\frac{\partial y_i}{\partial x}$ separately, but it is too slow. Is there a better way?

x_grads = []
for i in range(y.shape[-1]):  # loop over the K outputs
    x_grad = torch.autograd.grad(y[..., i],
                                 x,
                                 torch.ones_like(y[..., i]),
                                 retain_graph=True)[0]  # shape (B, 1)
    x_grads.append(x_grad)

x_grads = torch.cat(x_grads, dim=-1)  # shape (B, K)
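(For reference, newer PyTorch releases can batch this K-pass loop into a single call. Below is a rough sketch, assuming PyTorch 1.11+ where torch.autograd.grad accepts is_grads_batched; B, K, model, x and y are the placeholders from the snippets above.)

import torch

# Rough sketch, assuming PyTorch >= 1.11 (is_grads_batched) and that
# x has shape (B, 1), y = model(x) has shape (B, K), and x requires grad.
B, K = y.shape

# One "one-hot" cotangent per output column: grad_outputs[k][b, j] = 1 if j == k.
grad_outputs = torch.eye(K, device=y.device).unsqueeze(1).expand(K, B, K)

# A single call computes all K vector-Jacobian products at once.
(jac,) = torch.autograd.grad(y, x, grad_outputs=grad_outputs,
                             is_grads_batched=True, retain_graph=True)

x_grads = jac.squeeze(-1).transpose(0, 1)  # shape (B, K), same as the loop's result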

Hi Fangyu!

I’m not sure that I understand your exact use case, but the beta-version
torch.autograd.functional.jacobian() might be what you want:

>>> torch.__version__
'1.7.1'
>>> def my_model (x):
...     return torch.arange (2 * x.numel()).reshape ((2, -1)) * x * x
...
>>> x = torch.tensor ([2.0, 3.0, 5.0])
>>> my_model (x)
tensor([[  0.,   9.,  50.],
        [ 12.,  36., 125.]])
>>> torch.autograd.functional.jacobian (my_model, x)
tensor([[[ 0.,  0.,  0.],
         [ 0.,  6.,  0.],
         [ 0.,  0., 20.]],

        [[12.,  0.,  0.],
         [ 0., 24.,  0.],
         [ 0.,  0., 50.]]])
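For the batched case in the question (x of shape (B, 1), y of shape (B, K)), jacobian() returns a tensor of shape (B, K, B, 1); when each sample's outputs depend only on that sample's input, the desired (B, K) result is its block diagonal. A minimal sketch with a toy stand-in model (not the original model):

import torch

# Rough sketch: a toy per-sample model standing in for the real `model`.
# Each y[b] depends only on x[b], mapping (B, 1) -> (B, K).
B, K = 4, 3
w = torch.randn(1, K)

def toy_model(x):                 # x: (B, 1)
    return (x * x) @ w            # y: (B, K)

x = torch.randn(B, 1)

# Full Jacobian has shape (B, K, B, 1); cross-sample blocks are zero here.
jac = torch.autograd.functional.jacobian(toy_model, x)

# Keep only the block diagonal to get dy[b, k] / dx[b, 0], shape (B, K).
idx = torch.arange(B)
dy_dx = jac[idx, :, idx].squeeze(-1)
print(dy_dx.shape)                # torch.Size([4, 3])

Newer releases also offer a vectorize=True flag for jacobian() that batches the per-output backward passes internally, which may help with speed.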

Best.

K. Frank
