I have a tensor of shape
(n, m, 2), and I would like to perform an operation (a rotation) on the last dimension.
That operation depends on a leaf variable that should receive a gradient through backpropagation. I am using the code below, but
theta is getting a gradient of None:
left_rotation = torch.tensor([torch.cos(self.theta), -torch.sin(self.theta)])
left = (aligned_grid * left_rotation).sum(dim=2).pow(2) / self.alpha.pow(2)
# Do more things with the "left" variable
The intent here is to broadcast the values
torch.cos(self.theta) and -torch.sin(self.theta) over the last dimension.
What would be the proper way to go about this?
Thanks a lot for helping
When you do
torch.tensor([torch.cos(self.theta), ...]), you are creating a new leaf variable which doesn’t connect the graph to theta. So there is no longer any way to reach theta in your graph, and thus no gradient.
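A minimal sketch of the difference (theta here is a made-up stand-in for the poster's parameter): torch.stack composes existing tensors instead of copying their values, so it keeps the result connected to theta:

```python
import torch
import math

theta = torch.tensor(0.3, requires_grad=True)

# torch.tensor(...) copies the values into a brand-new leaf,
# so the path back to theta is cut.
detached = torch.tensor([torch.cos(theta), -torch.sin(theta)])
print(detached.requires_grad)  # False: backward() can never reach theta

# torch.stack keeps both entries in the autograd graph.
rotation = torch.stack([torch.cos(theta), -torch.sin(theta)])
rotation.sum().backward()
print(theta.grad)  # non-None: d/dtheta of cos(theta) - sin(theta)
```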
Use a list to hold those two values and then do the matrix multiplication in Python.
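One way to write that out (a sketch; `grid` stands in for the poster's aligned_grid) is to assemble the full 2×2 rotation matrix with torch.stack and let matmul broadcast it over the leading dimensions:

```python
import torch

theta = torch.tensor(0.3, requires_grad=True)
grid = torch.randn(4, 5, 2)  # stand-in for aligned_grid, shape (n, m, 2)

# torch.stack composes tensors that already depend on theta,
# so the rotation matrix stays connected to the graph.
rot = torch.stack([
    torch.stack([torch.cos(theta), -torch.sin(theta)]),
    torch.stack([torch.sin(theta),  torch.cos(theta)]),
])  # shape (2, 2)

# matmul broadcasts over the (n, m) batch dimensions:
# each 2-vector in the last dimension is rotated by rot.
rotated = grid @ rot.T  # shape (4, 5, 2)

# any loss built from `rotated` now backpropagates into theta
rotated[..., 0].sum().backward()
```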
Thanks, Kushajveer. That was not working either; I got an operand type mismatch error. Besides, that would not broadcast properly if I did it that way, right?
I’ve ended up refactoring this function by using
.clone() when needed to keep new tensor in the graph.
Can you give an example of how you did that?
Something like this:
left = self.meshgrid.clone()   # clone keeps the copies in the autograd graph
right = self.meshgrid.clone()

# shift both coordinates by k / h respectively
left[:, :, 0] -= self.k
left[:, :, 1] -= self.k
right[:, :, 0] -= self.h
right[:, :, 1] -= self.h

# apply the rotation components; these in-place multiplies are
# recorded by autograd, so theta stays reachable in the graph
left[:, :, 1] *= -torch.sin(self.theta)
left[:, :, 0] *= torch.cos(self.theta)
right[:, :, 0] *= torch.sin(self.theta)
right[:, :, 1] *= torch.cos(self.theta)
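For completeness, a minimal standalone sketch of why the clone-based version works (theta, k, and the meshgrid here are made-up stand-ins for the poster's attributes):

```python
import torch

theta = torch.tensor(0.3, requires_grad=True)
k = torch.tensor(1.0)
meshgrid = torch.randn(4, 5, 2)

# .clone() returns a copy that participates in the autograd graph,
# so the in-place edits below are recorded as differentiable ops.
left = meshgrid.clone()
left[:, :, 0] -= k
left[:, :, 1] -= k
left[:, :, 1] *= -torch.sin(theta)
left[:, :, 0] *= torch.cos(theta)

left.sum().backward()
print(theta.grad is not None)  # the graph now reaches theta
```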