I have the following implementation:

```python
import torch

def T_gradient(T, x):
    gradT = torch.empty((len(x), N, 2))
    for inc in range(N):
        # Gradient of one sequence position T[:, inc, :] (shape (B, 1))
        # w.r.t. x, with a matching ones vector as grad_outputs.
        gradT[:, inc, :] = torch.autograd.grad(
            T[:, inc, :], x,
            torch.ones((T.size(0), 1), device=device),
            create_graph=True, retain_graph=True,
        )[0]
    return gradT
```

(`N` and `device` are defined globally elsewhere.)
`T` is the output of a seq2seq model and has shape `(number of examples, N, 1)`, where `N` is the sequence length; `x` has shape `(number of examples, 2)`. My question: is there a way to avoid the for loop?
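For reference, here is the kind of vectorization I have in mind: a sketch (not tested against my actual model) using the `is_grads_batched` flag of `torch.autograd.grad` (available since PyTorch 1.11). The idea is to stack one one-hot `grad_outputs` slice per sequence position, so a single vmapped call replaces the loop. The helper name `T_gradient_vectorized` and the explicit `N` argument are mine; the rest mirrors the looped version.

```python
import torch

def T_gradient_vectorized(T, x, N):
    """Vectorized sketch of T_gradient using is_grads_batched.

    T: model output of shape (B, N, 1), built from x with requires_grad=True.
    x: input of shape (B, 2).
    Returns gradT of shape (B, N, 2), like the looped version.
    """
    B = x.shape[0]
    # One grad_outputs slice per sequence position: slice i is one-hot at
    # position i along dim 1, playing the role of the ones vector that the
    # loop passes for T[:, i, :]. Shape: (N, B, N, 1).
    grad_outputs = (
        torch.eye(N, dtype=T.dtype, device=T.device)
        .view(N, 1, N, 1)
        .expand(N, B, N, 1)
    )
    grads = torch.autograd.grad(
        T, x,
        grad_outputs=grad_outputs,
        is_grads_batched=True,  # vmap over the leading dim of grad_outputs
        create_graph=True, retain_graph=True,
    )[0]                        # shape (N, B, 2)
    return grads.permute(1, 0, 2)  # (B, N, 2)
```

This relies on the same fact the loop does: examples in the batch are independent, so summing over the batch via a ones vector still yields per-example gradients. If you have access to the model as a function of `x` (rather than only the tensor `T`), `torch.func.vmap(torch.func.jacrev(...))` would be another option.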