Need to get gradients w.r.t. a single variable used to compute an affine transformation matrix

Hello, I have the following snippet:

import torch
import torch.nn.functional as F
from torch.autograd import Variable

def param2theta(param, w=25, h=1):

   param = torch.pinverse(param)
   theta = torch.zeros([2,3])
   theta[0,0] = param[0,0]
   theta[0,1] = param[0,1]*h/w
   theta[0,2] = param[0,2]*2/w + theta[0,0] + theta[0,1] - 1
   theta[1,0] = param[1,0]*w/h
   theta[1,1] = param[1,1]
   theta[1,2] = param[1,2]*2/h + theta[1,0] + theta[1,1] - 1

   return theta.unsqueeze(0)


mat = torch.ones(1,26).view(1,1,1,-1).float()


##scale##
y =  Variable(torch.tensor([.2]), requires_grad = True)
s = y*(torch.tensor([[ [1.,0.,0],[0,0.,0], [0,0,0] ]]).reshape( 1,3,3) ).requires_grad_() + \
(torch.tensor([[ [0,0.,0],[0,1.,0], [0,0,1] ]]).reshape( 1,3,3) )

##translation##
z =  Variable(torch.tensor([ y* mat.size(-1) ]), requires_grad = True)
t = z*(torch.tensor([[ [0,0.,1.],[0,0,0.], [0,0,0] ]]).reshape( 1,3,3) ).requires_grad_() + \
(torch.tensor([[ [1,0.,0],[0,1,0.], [0,0,1] ]]).reshape( 1,3,3) )

f=torch.bmm(t,s)
theta = param2theta(f.squeeze())
grid = F.affine_grid(theta, mat.size())

mat = F.grid_sample(mat, grid, mode='nearest')

print('after grid_sample',mat,'\n \n \n')

mat.sum().backward()


print(y.grad)
print(z.grad)

However, after calling mat.sum().backward(), both y.grad and z.grad are 0.

What can I do to get the gradients w.r.t. y?

Hi,

Without looking into it in too much detail, I can tell you that the gradient of mat = F.grid_sample(mat, grid, mode='nearest') is 0 because of mode='nearest'. Try any other interpolation mode (for example mode='bilinear') and you should get a non-zero gradient.

Hope this helps.

(To be as clear as possible, gradient is 0 in that case because your function is piece-wise constant with mode="nearest".)
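To illustrate the point about piece-wise constant functions, here is a minimal sketch that is separate from the original snippet. The helper name shift_grad, the ramp image, and the choice of padding_mode='border' are assumptions made only for this illustration. It shifts a small image horizontally by a learnable offset and compares the gradient of the summed output under the two interpolation modes.

```python
import torch
import torch.nn.functional as F

def shift_grad(mode):
    """Gradient of the summed, resampled image w.r.t. a horizontal shift tx."""
    # 1x1x1x8 ramp image: values 0..7, so the output depends on where we sample
    img = torch.arange(8, dtype=torch.float32).view(1, 1, 1, 8)
    tx = torch.tensor(0.1, requires_grad=True)  # shift in normalized coordinates

    # Identity affine transform plus a translation tx along x, shape (1, 2, 3)
    row_x = torch.stack([torch.tensor(1.0), torch.tensor(0.0), tx])
    row_y = torch.tensor([0.0, 1.0, 0.0])
    theta = torch.stack([row_x, row_y]).unsqueeze(0)

    grid = F.affine_grid(theta, img.shape, align_corners=False)
    out = F.grid_sample(img, grid, mode=mode,
                        padding_mode='border', align_corners=False)
    out.sum().backward()
    return tx.grad.item()

print('nearest :', shift_grad('nearest'))   # piece-wise constant in tx, gradient is 0
print('bilinear:', shift_grad('bilinear'))  # piece-wise linear in tx, gradient is non-zero
```

With mode='nearest', a small change in tx selects the same pixels, so the output does not change and the gradient is zero. With mode='bilinear', the output is a weighted blend of neighbouring pixels and the weights vary with tx.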

Hello, and thank you for your time.
I just changed mode='nearest' to mode='bilinear' and it still outputs zero gradients.
Any other suggestions?

Actually, I'm not sure why, but I think Colab had lagged on me, which is why nothing changed even after I switched the mode as you suggested. After refreshing, with mode='bilinear', it works as expected. Thank you!
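One further detail in the original snippet may be worth checking, although it was not raised in the thread: z is built by wrapping y * mat.size(-1) in a new torch.tensor(...), which creates a fresh leaf tensor. Gradients that reach z would then stop there instead of flowing back to y. The sketch below uses hypothetical names (z_detached, z_linked) and the width 26 from the snippet to show the difference.

```python
import torch

y = torch.tensor([0.2], requires_grad=True)

# Re-wrapping a computed value creates a new leaf with no history back to y
z_detached = torch.tensor([y.item() * 26], requires_grad=True)
print(z_detached.is_leaf, z_detached.grad_fn)  # a leaf with no grad_fn

# Computing z directly from y keeps it in y's autograd graph
z_linked = y * 26
print(z_linked.is_leaf, z_linked.grad_fn)      # not a leaf, has a grad_fn

z_linked.sum().backward()
print(y.grad)  # gradient reaches y through z_linked
```

If the goal is a single gradient w.r.t. y that accounts for both the scale and the translation, defining z as y * mat.size(-1) without re-wrapping is one way to keep both paths connected.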