PyTorch most efficient Jacobian / Hessian calculation

I am looking for the most efficient way to compute the Jacobian of a function in PyTorch and have so far come up with the following solutions:

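import torch
from time import time
from torch.autograd import grad

def func(X):
    # Maps X of shape (1, n) to shape (1, 3): sums of squares, cubes, and fourth powers.
    return torch.stack((X.pow(2).sum(1),
                        X.pow(3).sum(1),
                        X.pow(4).sum(1)), 1)

# Move to the GPU first, then enable gradients, so that X stays a leaf tensor.
X = (torch.ones(1, int(1e5)) * 2.00094).cuda().requires_grad_(True)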

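# Solution 1: one backward pass per output component
# Note: CUDA calls are asynchronous; call torch.cuda.synchronize() before reading the clock for reliable timings.
t = time()
Y = func(X)
J = torch.zeros(3, int(1e5), device=X.device)  # keep J on the same device as X
for i in range(3):
    # Row i of the Jacobian: gradient of the i-th output w.r.t. X.
    J[i] = grad(Y[0][i], X, create_graph=True, retain_graph=True, allow_unused=True)[0]
print(time()-t)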
Output: 0.002 s

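# Solution 2: replicate the input and do a single backward pass
def Jacobian(f, X):
    # One copy of X per output; row i of the copy only receives the gradient of output i.
    X_batch = X.detach().repeat(3, 1).requires_grad_(True)
    f(X_batch).backward(torch.eye(3, device=X.device), retain_graph=True)
    return X_batch.grad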

t = time()
J2 = Jacobian(func,X)
print(time()-t)
Output: 0.001 s

Since there does not seem to be a big difference between the loop in the first solution and the single batched backward pass in the second, I wanted to ask whether there is a still faster way to calculate a Jacobian in PyTorch.
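
One option that avoids writing the loop by hand is `torch.autograd.functional.jacobian`, which exists in more recent PyTorch releases than the `Variable` API used above. The following is only a minimal sketch, assuming a CPU tensor and a smaller input size (`n = 1000`, chosen purely for illustration). Whether it is actually faster than either solution depends on the PyTorch version and hardware, so it should be timed rather than assumed.

```python
import torch
from torch.autograd.functional import jacobian

def func(X):
    # Same function as in the question: maps (1, n) -> (1, 3).
    return torch.stack((X.pow(2).sum(1),
                        X.pow(3).sum(1),
                        X.pow(4).sum(1)), 1)

n = 1000  # illustrative size, smaller than the 1e5 used in the question
X = torch.ones(1, n, dtype=torch.float64) * 2.00094

# jacobian returns shape output.shape + input.shape = (1, 3, 1, n).
# vectorize=True asks PyTorch to batch the backward passes instead of looping.
J = jacobian(func, X, vectorize=True).reshape(3, n)

# Closed-form rows for comparison: derivatives of x^2, x^3, x^4.
x = X[0]
J_expected = torch.stack((2 * x, 3 * x.pow(2), 4 * x.pow(3)))
```

Comparing `J` with `J_expected` via `torch.allclose` is a quick way to check that the reshaped result has the same row layout as the two solutions above.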

My other question is then also about what might be the most efficient way to calculate the Hessian.
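
For a scalar-valued function, `torch.autograd.functional.hessian` is one candidate in recent PyTorch versions. The sketch below uses a hypothetical function `scalar_func` (not part of the code above) whose Hessian is known in closed form, so the result can be checked. It makes no claim about being the fastest approach.

```python
import torch
from torch.autograd.functional import hessian

def scalar_func(x):
    # Hypothetical scalar-valued example: f(x) = sum(x^3).
    return x.pow(3).sum()

x = torch.tensor([1.0, 2.0, 3.0], dtype=torch.float64)

# hessian returns shape input.shape + input.shape = (3, 3).
H = hessian(scalar_func, x)

# For f(x) = sum(x^3) the Hessian is diagonal with entries 6 * x_i.
H_expected = torch.diag(6 * x)
```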

Finally, does anyone know whether something like this can be done more easily or more efficiently in TensorFlow?

Continuing the discussion from Pytorch most efficient Jacobian / Hessian calculation:

This is the fastest way I know of.
Suppose you have a per-sample loss that is a function of x, where x has shape (batch_size, x_dim) and requires gradients:

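        loss = f(x)  # shape (batch_size,)
        first_drv = torch.zeros(batch_size, x_dim)
        hessian = torch.zeros(batch_size, x_dim, x_dim)
        for n in range(batch_size):
            # Gradient of the n-th loss w.r.t. x; keep the graph so it can be differentiated again.
            # (The post as written differentiated w.r.t. `dz`, which is never defined; `x` is assumed here.)
            first_drv[n] = torch.autograd.grad(loss[n], x,
                                               create_graph=True, retain_graph=True)[0][n]
            for i in range(x_dim):
                # Row i of the n-th Hessian: gradient of the i-th first derivative.
                hessian[n][i] = torch.autograd.grad(first_drv[n][i], x,
                                                    create_graph=True, retain_graph=True)[0][n]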
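
As a point of comparison for the double loop, the per-sample Hessian can also be written without Python loops using `torch.func` (assuming PyTorch 2.0 or later). This is a sketch under stated assumptions: `per_sample_loss` is a hypothetical loss acting on one sample at a time, chosen so the expected Hessian is known in closed form. It is not the `f` from the post.

```python
import torch
from torch.func import hessian, vmap

def per_sample_loss(x):
    # Hypothetical loss for a single sample x of shape (x_dim,).
    return x.pow(3).sum() + x.sum().pow(2)

batch_size, x_dim = 4, 3
x = torch.randn(batch_size, x_dim, dtype=torch.float64)

# vmap maps the single-sample Hessian over the batch dimension,
# giving shape (batch_size, x_dim, x_dim).
H = vmap(hessian(per_sample_loss))(x)

# Closed form: diag(6 * x) from the cubic term plus 2 everywhere from (sum x)^2.
H_expected = torch.diag_embed(6 * x) + 2.0
```

This relies on the loss being expressible per sample; if `f` mixes samples within a batch, the loop above may be the simpler route.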