Is there an efficient way to remove the for-loop? Maybe use torch.vmap?
import torch
a = torch.tensor([1.0], requires_grad=True)
b = torch.exp(torch.rand(100, 100)) * a
grad = torch.zeros(100, 100)
for i in range(b.shape[0]):
    for j in range(b.shape[1]):
        grad[i, j] = torch.autograd.grad(b[i, j], a, create_graph=True)[0]
print(grad)
Backward-mode automatic differentiation naturally gives you the gradient
of a scalar function, that is, the (partial) derivatives of that scalar with
respect to the elements of one or more tensors.
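For contrast, the familiar backward-mode pattern is a single scalar loss whose gradient with respect to an entire tensor falls out of one backward pass (a minimal sketch):

import torch

x = torch.rand(100, 100, requires_grad=True)
loss = (x ** 2).sum()   # one scalar output
loss.backward()         # one backward pass
print(x.grad.shape)     # torch.Size([100, 100]): d loss / d x[i, j] for every element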
Conversely, forward-mode automatic differentiation naturally gives you the
derivatives of the elements of one or more tensors with respect to a single
scalar, and it does so with a single forward-mode pass.
For your specific example, where every element of b is differentiated with respect to the single scalar a, forward mode is the better fit and removes the loop entirely.
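Here is a minimal sketch using the torch.autograd.forward_ad dual-number API (available in recent PyTorch releases). The coefficient matrix is built once, and seeding a with a tangent of 1.0 yields every derivative d b[i, j] / d a in a single forward pass, with no loop:

import torch
import torch.autograd.forward_ad as fwAD

coeff = torch.exp(torch.rand(100, 100))   # fixed coefficient matrix, as in your example

with fwAD.dual_level():
    # Attach a tangent of 1.0 to a, i.e. differentiate with respect to a.
    dual_a = fwAD.make_dual(torch.tensor([1.0]), torch.tensor([1.0]))
    dual_b = coeff * dual_a
    # The tangent of the output carries d b[i, j] / d a for every element.
    grad = fwAD.unpack_dual(dual_b).tangent

print(grad)

Because b is linear in a, grad is simply the coefficient matrix itself, which matches what the double loop computes. If you are on a recent PyTorch 2.x, torch.func.jvp wraps the same forward-mode machinery in a functional interface.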