Gradient for only part of a tensor

Is there a way to not calculate the gradient for part of a tensor? The only way I can think of to make this work is to calculate the full gradient and then zero it out.

For example, here I want to freeze the last row of the matrix T and only calculate and update the gradient for the first two rows:

import torch

theta = torch.ones(1)
dxy = torch.ones(2)
T = torch.tensor([
    [torch.cos(theta), -torch.sin(theta), dxy[0]],
    [torch.sin(theta), torch.cos(theta), dxy[1]],
    [0.0, 0.0, 1.0],
])
T.requires_grad = True

# Forward (on CPU here; if you use .cuda(), move T to the GPU as well)
out = torch.rand(1, 3).mm(T)
gt = torch.zeros(1).long()
loss = torch.nn.functional.cross_entropy(out, gt)

# Backward, then zero out the last row's gradient
loss.backward()
T.grad[-1] = torch.zeros(3)

# Optimizer step
with torch.no_grad():
    T -= 0.01 * T.grad

After you call loss.backward(), and because your tensor T requires grad, gradients will be calculated for the entire T.

You cannot make just part of T receive a gradient; there is no way to set a mask for it. The gradient information lives in the dynamic computational graph that autograd builds.

So, I think the only way is to zero out the gradients, like you did.
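One way to automate that zeroing is a gradient hook: `Tensor.register_hook` lets you modify the gradient as it flows back, before it is accumulated into `T.grad`. A minimal sketch (using an identity matrix as a stand-in for your transform; the "freeze the last row" mask matches your example):

```python
import torch

T = torch.eye(3, requires_grad=True)

def mask_grad(grad):
    # Zero the last row of the incoming gradient on every backward pass.
    grad = grad.clone()  # avoid modifying the original gradient in place
    grad[-1] = 0.0
    return grad

T.register_hook(mask_grad)

out = torch.rand(1, 3).mm(T)
loss = torch.nn.functional.cross_entropy(out, torch.zeros(1).long())
loss.backward()

# T.grad[-1] is now all zeros, with no manual step after backward()
```

The full gradient is still computed by autograd; the hook just masks it automatically, so you do not have to remember to zero it before every optimizer step.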

There is one more option: you can concatenate two tensors, T and nogradT. Since concatenation is visible to the autograd system, you can set requires_grad on T and leave nogradT without gradients. This way you will have gradients only for T.
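A sketch of that idea (the names T_top and T_bottom are mine): keep the trainable first two rows and the fixed last row as separate tensors, and rebuild the full matrix with torch.cat inside the forward pass:

```python
import torch

# Trainable first two rows; fixed last row with no gradient.
T_top = torch.tensor([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0]], requires_grad=True)
T_bottom = torch.tensor([[0.0, 0.0, 1.0]])

# Concatenation is recorded by autograd, so gradients flow back to T_top only.
T = torch.cat([T_top, T_bottom], dim=0)
out = torch.rand(1, 3).mm(T)
loss = torch.nn.functional.cross_entropy(out, torch.zeros(1).long())
loss.backward()

# T_top.grad is populated; T_bottom.grad is None (never computed).

# Optimizer step touches only the trainable rows.
with torch.no_grad():
    T_top -= 0.01 * T_top.grad
```

Note that the gradient for the fixed row is never computed at all here, whereas the zeroing approach computes it and then discards it.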