Hi everybody,

I’ve trained a model on 3D data and now I’d like to add a learnable rotation matrix and apply it right after taking each new batch (with the model parameters frozen).

This is my Rotation_Matrix class:

```python
class Rotation_Matrix(nn.Module):
    def __init__(self):
        super(Rotation_Matrix, self).__init__()
        self.a = nn.Parameter(torch.tensor(1.))  # init value of Rx cosine
        self.b = nn.Parameter(torch.tensor(1.))  # init value of Ry cosine
        self.c = nn.Parameter(torch.tensor(1.))  # init value of Rz cosine

    def get_rotation_matrix(self):
        Rx = torch.tensor([[1., 0., 0.],
                           [0., self.a, -torch.sqrt(1 - torch.pow(self.a, 2))],
                           [0., torch.sqrt(1 - torch.pow(self.a, 2)), self.a]])

        Ry = torch.tensor([[self.b, 0., torch.sqrt(1 - torch.pow(self.b, 2))],
                           [0., 1., 0.],
                           [-torch.sqrt(1 - torch.pow(self.b, 2)), 0., self.b]])

        Rz = torch.tensor([[self.c, -torch.sqrt(1 - torch.pow(self.c, 2)), 0.],
                           [torch.sqrt(1 - torch.pow(self.c, 2)), self.c, 0.],
                           [0., 0., 1.]])

        return torch.mm(Rx, torch.mm(Ry, Rz))

    def forward(self, x):
        self.matrix = self.get_rotation_matrix()
        return torch.matmul(x, self.matrix)
```

Then the initialisation and training of the rotation matrix. I haven’t taken care of forcing the rotation parameters to stay in the [-1, 1] interval just yet.

```python
# load pre-trained model
model.eval()

rotation_matrix = Rotation_Matrix()
rotation_matrix.train()

for i in range(n_epochs):
    x = rotation_matrix(x)
    x = model(x)

    loss = loss_fun(x)

    loss.backward(retain_graph=True)
```

Printing the grad of any parameter from `rotation_matrix` gives a `None` value.
I read some related topics but didn’t find a solution. I tried using `retain_grad()` on the parameters before calling `loss.backward()` and I played with the `autograd.grad()` function, but with no results. What am I missing here? In addition, do I wrap `x = model(x)` in `with torch.no_grad():`, or would that prevent backpropagating to the rotation parameters?

MS.

Hi, so you are missing one important part, which is calling `optimizer.step()` after `loss.backward()`. The `.backward()` function merely computes the gradients of each parameter with respect to the loss; it is the `.step()` function that actually applies the updates to the parameters. Also, `optimizer.zero_grad()` will set the gradients to 0, so there is no need to call `rotation_matrix.zero_grad()` after it.
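
For reference, the usual ordering looks something like this (a rough sketch, assuming an SGD optimizer over just the rotation parameters):

```python
import torch.optim as optim

# hypothetical optimizer over the rotation parameters only
optimizer = optim.SGD(rotation_matrix.parameters(), lr=1e-2)

for i in range(n_epochs):
    optimizer.zero_grad()     # clear gradients from the previous step
    out = rotation_matrix(x)  # apply the learnable rotation
    out = model(out)          # frozen, pre-trained model
    loss = loss_fun(out)
    loss.backward()           # compute gradients w.r.t. the loss
    optimizer.step()          # apply the parameter updates
```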

Hi Diego,

Of course, I forgot this line. However, the gradients still aren’t computed. As I mentioned in the first post, the problem is the `None` values of the parameters’ grads after calling `.backward()`, not applying the updates from correctly computed gradients (which is done by `optimizer.step()`).

`self.matrix` will be a plain `tensor`, not an `nn.Parameter`, if you wrap single `nn.Parameter`s in another tensor: `torch.tensor(...)` creates a new tensor and detaches it from the computation graph, so no gradients flow back to `a`, `b`, and `c`.
You can check it by calling `print(rotation_matrix.matrix)`.
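
For example (a quick check, assuming `forward` has been called once so that `self.matrix` exists):

```python
_ = rotation_matrix(torch.randn(1, 3))       # populates self.matrix
print(type(rotation_matrix.matrix))          # <class 'torch.Tensor'>, not nn.Parameter
print(rotation_matrix.matrix.requires_grad)  # False: the graph to a, b, c was cut
```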

If you create the whole `matrix` as an `nn.Parameter`, the code should work:

```python
class Rotation_Matrix(nn.Module):
    def __init__(self):
        super(Rotation_Matrix, self).__init__()
        self.a = torch.tensor(1.)  # init value of Rx cosine
        self.b = torch.tensor(1.)  # init value of Ry cosine
        self.c = torch.tensor(1.)  # init value of Rz cosine
        self.matrix = self.get_rotation_matrix()

    def get_rotation_matrix(self):
        Rx = torch.tensor([[1., 0., 0.],
                           [0., self.a, -torch.sqrt(1 - torch.pow(self.a, 2))],
                           [0., torch.sqrt(1 - torch.pow(self.a, 2)), self.a]])

        Ry = torch.tensor([[self.b, 0., torch.sqrt(1 - torch.pow(self.b, 2))],
                           [0., 1., 0.],
                           [-torch.sqrt(1 - torch.pow(self.b, 2)), 0., self.b]])

        Rz = torch.tensor([[self.c, -torch.sqrt(1 - torch.pow(self.c, 2)), 0.],
                           [torch.sqrt(1 - torch.pow(self.c, 2)), self.c, 0.],
                           [0., 0., 1.]])

        return nn.Parameter(torch.mm(Rx, torch.mm(Ry, Rz)))

    def forward(self, x):
        return torch.matmul(x, self.matrix)


model = nn.Linear(3, 1)
model.eval()

rotation_matrix = Rotation_Matrix()
rotation_matrix.train()

x = torch.randn(1, 3)
target = torch.randn(1, 1)
criterion = nn.MSELoss()

for i in range(10):
    output = rotation_matrix(x)
    output = model(output)
    loss = criterion(output, target)

    loss.backward()
```
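
After `backward()`, the gradient should now be populated:

```python
print(rotation_matrix.matrix.grad)  # a 3x3 tensor now, instead of None
```
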
After the changes you suggested, gradients are computed correctly for every element of the matrix. This causes another problem, though: after the weight update the matrix is no longer a rotation matrix (orthogonal and with determinant equal to 1). I want to retain its structure so that after each iteration it is still a rotation matrix, just as in the constructor. So I’d like to update only the parameters `a, b, c`.
In addition, another question came up. Where do I restrict `a, b, c` to stay in the [-1, 1] segment to force these parameters to act like cosine functions? I thought of something like:

```python
rotation_matrix.a = torch.max(torch.tensor(-1.), torch.min(rotation_matrix.a, torch.tensor(1.)))
```
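
or, to address both points at once, keeping only `a`, `b`, `c` as `nn.Parameter`s and rebuilding the matrix from them on every forward pass with `torch.stack` (which, unlike `torch.tensor(...)`, keeps them in the autograd graph), clamping them to [-1, 1] on the way. Just a rough sketch, not tested on the real model:

```python
class Rotation_Matrix(nn.Module):
    def __init__(self):
        super(Rotation_Matrix, self).__init__()
        # init just inside (-1, 1): at exactly 1.0 the sqrt below has an infinite gradient
        self.a = nn.Parameter(torch.tensor(0.99))  # Rx cosine
        self.b = nn.Parameter(torch.tensor(0.99))  # Ry cosine
        self.c = nn.Parameter(torch.tensor(0.99))  # Rz cosine

    def get_rotation_matrix(self):
        one, zero = torch.ones(()), torch.zeros(())

        # clamp keeps the parameters acting like valid cosines
        a = self.a.clamp(-1., 1.)
        b = self.b.clamp(-1., 1.)
        c = self.c.clamp(-1., 1.)
        sa, sb, sc = torch.sqrt(1 - a**2), torch.sqrt(1 - b**2), torch.sqrt(1 - c**2)

        # torch.stack keeps a, b, c in the autograd graph, unlike torch.tensor(...)
        Rx = torch.stack([torch.stack([one, zero, zero]),
                          torch.stack([zero, a, -sa]),
                          torch.stack([zero, sa, a])])
        Ry = torch.stack([torch.stack([b, zero, sb]),
                          torch.stack([zero, one, zero]),
                          torch.stack([-sb, zero, b])])
        Rz = torch.stack([torch.stack([c, -sc, zero]),
                          torch.stack([sc, c, zero]),
                          torch.stack([zero, zero, one])])

        return torch.mm(Rx, torch.mm(Ry, Rz))

    def forward(self, x):
        # rebuild the matrix each call so it is always a valid rotation
        return torch.matmul(x, self.get_rotation_matrix())
```

Since the optimizer then only sees `a`, `b`, `c`, the matrix produced in each forward pass is orthogonal with determinant 1 by construction. An alternative would be to parameterise by angles and use `torch.cos`/`torch.sin` directly, which avoids the [-1, 1] constraint altogether.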