Hello, I want to calculate Jacobian matrices for a batch of data.
I have x of shape (batch_size, 3) and a computed y of shape (batch_size, 3),
and I need a Jacobian of shape (batch_size, 3, 3).
I tried the following code:
import torch

x = torch.randn(1024, 3, requires_grad=True)  # a batch of coordinates
y = torch.sin(x)  # y computed in the forward pass

grads = []
for i_dim in range(y.shape[1]):
    y_i = y[:, i_dim]
    ones = torch.ones_like(y_i)
    # `torch.autograd.grad` seems to implicitly sum over output dimensions,
    # so I compute the grad separately for every dim of `y`.
    # But this still fails:
    grad = torch.autograd.grad(y_i, x, grad_outputs=ones,
                               create_graph=True, retain_graph=True,
                               is_grads_batched=True)[0]
    grads.append(grad)

grads = torch.stack(grads, dim=1)  # intended shape: (1024, 3, 3)
print(grads.shape)

This raises:
RuntimeError: If `is_grads_batched=True`,
we interpret the first dimension of each grad_output as the batch dimension.
The sizes of the remaining dimensions are expected to match the shape of corresponding output,
but a mismatch was detected: grad_output[0] has a shape of torch.Size([]) and output[0] has a shape of torch.Size([1024]).
If you only want some tensors in `grad_output` to be considered batched, consider using vmap.
I think the output and grad_outputs already have the same shape, (1024,). Why is this error raised?
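For comparison, the same per-dimension loop without is_grads_batched runs for me; this is a minimal sketch of what I expected the batched version to do as well:

```python
import torch

x = torch.randn(1024, 3, requires_grad=True)
y = torch.sin(x)  # per-sample elementwise function, no mixing across the batch

grads = []
for i_dim in range(y.shape[1]):
    y_i = y[:, i_dim]
    # Without is_grads_batched, grad_outputs must match y_i's shape exactly.
    ones = torch.ones_like(y_i)
    grad = torch.autograd.grad(y_i, x, grad_outputs=ones,
                               create_graph=True, retain_graph=True)[0]
    grads.append(grad)  # each grad: (1024, 3), the i-th Jacobian row per sample

jac = torch.stack(grads, dim=1)  # (1024, 3, 3)
```

This relies on y[b] depending only on x[b], so summing the gradient over the batch (which grad_outputs=ones effectively does) still recovers each sample's Jacobian row.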
BTW, I tried torch.func.jacrev(), but it requires computing y inside the function passed to jacrev(), whereas I need to compute y beforehand; my code is already structured that way.
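For context, the jacrev route I tried looks like the sketch below. It produces the right shape, but only because y is recomputed inside the transformed function, which doesn't fit my existing code:

```python
import torch
from torch.func import jacrev, vmap

x = torch.randn(1024, 3)
# jacrev differentiates the function itself, so y = sin(x) is recomputed
# inside the transform; vmap maps the per-sample Jacobian over the batch.
jac = vmap(jacrev(torch.sin))(x)  # (1024, 3, 3)
```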