Vectorization in torch.autograd

Hi. I have an input tensor input of dimension (N x d) containing N samples, which gets mapped to an output tensor output of dimension (N x m), i.e. each sample is mapped from R^d to R^m.
Now, I want to compute the gradients of each entry of the result with respect to the corresponding input sample, i.e. I want to obtain an N x m x d (or N x d x m or similar) tensor A with the respective gradients. Until now, I used a for loop over the second dimension of the output tensor (i.e. letting i run over the m output dimensions), selected output[:,i], and used the command
torch.autograd.grad(output[:,i], input, torch.ones_like(output[:,i]))[0]
to obtain an N x d tensor on each of the m iterations of the for loop. However, to speed up my code, I would now like to vectorize this procedure. Sadly, I could not see how to do this. Is there a simple way using just torch.autograd.grad, and if not, how can I proceed?
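For reference, here is a minimal, self-contained sketch of the loop-based procedure I currently use. The model (a small Linear layer) and the sizes N, d, m are just placeholder assumptions for illustration:

```python
import torch

# Hypothetical setup: a toy map from R^d to R^m applied to N samples.
N, d, m = 8, 3, 5
input = torch.randn(N, d, requires_grad=True)
model = torch.nn.Linear(d, m)
output = model(input)  # shape (N, m)

# Current approach: one autograd.grad call per output dimension i.
grads = []
for i in range(m):
    g = torch.autograd.grad(
        output[:, i], input,
        grad_outputs=torch.ones_like(output[:, i]),
        retain_graph=True,  # keep the graph alive for the next iteration
    )[0]  # shape (N, d): gradient of output[:, i] w.r.t. each input sample
    grads.append(g)

# Stack the m per-column gradients into the desired (N, m, d) tensor.
A = torch.stack(grads, dim=1)
```

Because each sample's output depends only on that sample's input, summing output[:, i] via grad_outputs of ones still yields the correct per-sample gradients; it is the m separate backward passes that I would like to eliminate.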