How to compute derivatives of outputs w.r.t. parts of inputs in a differentiable manner, i.e. specific columns of the Jacobian matrix

I am trying to compute the Jacobian matrix in a differentiable manner using torch.autograd.functional.jacobian with create_graph=True, but the dimensions of the outputs and inputs are too large and the GPU runs out of memory.

I found that I only need some specific columns of the Jacobian matrix, i.e. the derivatives of the outputs w.r.t. some specific parts of the inputs. If I could calculate just that part of the derivative in a differentiable manner, it would save a lot of memory.
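For context, the full-Jacobian version that runs out of memory looks roughly like this (f and the shapes here are simplified placeholders for my real model, not the actual code):

import torch

# Placeholder for the real model: maps a (1, 768) input to a (170, 2) output.
def f(inputs):
    return inputs.sum() * torch.ones(170, 2)

inputs = torch.randn(1, 768, requires_grad=True)

# create_graph=True keeps the Jacobian differentiable, but the full (170, 2, 1, 768)
# result plus all the retained graph is what exhausts GPU memory in my real setup.
jac = torch.autograd.functional.jacobian(f, inputs, create_graph=True)
print(jac.shape)  # torch.Size([170, 2, 1, 768])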

So I tried the following:

x = torch.tensor([1., 2., 3., 4., 5., 6.], requires_grad=True)
y = x
# Try to differentiate one output element w.r.t. one input element.
d = torch.autograd.grad(y[0], x[0], retain_graph=True, create_graph=True, allow_unused=True)
print(d)

output: 
(None,)

But I just got None.

Is there any way to compute the derivatives of the outputs w.r.t. only parts of the inputs in a differentiable manner?
Or is there any other method for computing specific columns of the Jacobian matrix in a differentiable manner?

I’m not sure why the code above cannot produce the answer, but I figured out another way:

x0 = torch.tensor([1., 2., 3.], requires_grad=True)   # the part of the input I need derivatives for
x1 = torch.tensor([4., 5., 6.], requires_grad=False)  # the rest of the input
x = torch.stack([x0, x1], dim=0)
y = x
# Differentiate one output element w.r.t. the leaf tensor x0 only.
d = torch.autograd.grad(y[0, 0], x0, retain_graph=True, create_graph=True)
print(d)

output:
(tensor([1., 0., 0.]),)

But with this approach, I have to take the part of the input x that I need derivatives for out of the full input and then stack it back together with the other parts of the input.

I used this method to calculate one column of the Jacobian matrix in a differentiable manner, but I still run out of memory. Any ideas for saving memory? The output size is (170, 2) and the part of the inputs I need derivatives for is (1, 768).
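To make the scale concrete, here is roughly what I am doing; f, the x0/x1 split, and x1's shape are simplified stand-ins for my real code, so this is a sketch rather than the actual model:

import torch

# Placeholder model: maps the stacked input to a (170, 2) output.
def f(x):
    return x.sum() * torch.ones(170, 2)

x0 = torch.randn(1, 768, requires_grad=True)   # the part I need derivatives for
x1 = torch.randn(3, 768, requires_grad=False)  # the rest of the input (shape made up here)
x = torch.cat([x0, x1], dim=0)

y = f(x)  # shape (170, 2)

# One autograd.grad call per output element, each with create_graph=True so the
# result stays differentiable. With 170 * 2 = 340 output elements, the backward
# graphs created by these calls pile up, which is where the memory goes.
col_idx = 0  # which element of the (1, 768) part this column corresponds to
col = []
for i in range(y.shape[0]):
    for j in range(y.shape[1]):
        (g,) = torch.autograd.grad(y[i, j], x0, retain_graph=True, create_graph=True)
        col.append(g.reshape(-1)[col_idx])
jac_col = torch.stack(col)  # shape (340,): one column of the Jacobian, still differentiable
print(jac_col.shape)  # torch.Size([340])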