Is there a way to avoid calculating the gradient for part of a tensor? The only way I can think of to make this work is to calculate the full gradient and then zero out the part I don't need.
For example, here I want to freeze the last row of the matrix T and only calculate and update the gradient for the first two rows:
theta = torch.ones(1)
dxy = torch.ones(2)
# Build the 2D rigid transform; the last row is a fixed homogeneous row
T = torch.tensor([
    [torch.cos(theta), -torch.sin(theta), dxy[0]],
    [torch.sin(theta), torch.cos(theta), dxy[1]],
    [0.0, 0.0, 1.0],
]).float().cuda()
T.requires_grad = True

# Forward
out = torch.rand(1, 3).cuda().mm(T)
gt = torch.zeros(1).cuda().long()
loss = torch.nn.functional.cross_entropy(out, gt)

# Backward
loss.backward()
# Zero the gradient of the last row so the update below leaves it unchanged
T.grad[-1] = torch.zeros(3, device=T.device)

# Optimizer step
with torch.no_grad():
    T -= 0.01 * T.grad
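
For reference, the closest I've come to automating the zeroing is registering a gradient hook that masks the last row during backward. As far as I understand, this still computes the full gradient; it just zeroes part of it before the optimizer step. A minimal sketch (mask is just my own helper name), registered right after creating T and before calling backward:

mask = torch.ones_like(T)  # requires_grad=False by default
mask[-1] = 0.0             # zero over the row I want frozen
T.register_hook(lambda grad: grad * mask)  # applied automatically on backward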