Hi! I’m currently trying to learn the best affine transformation for cropping an image. My approach: I take an image, then set the parameters of a 2x3 matrix, let’s call it M, as:

[scale * 1, 0, dx
 0, scale * 1, dy]

where scale, dx, dy are the parameters being learned.

For explanation I will refer to these values as:
[a, b, c
d, e, f]
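For reference, here is a minimal sketch of how I assemble M from those scalars (same `torch.cat` construction as in the code further down; the concrete values here are just placeholders):

```python
import torch

# scale, dx, dy are the learned scalars (placeholder values here)
scale = torch.tensor([1.5], requires_grad=True)
dxdy = torch.tensor([[0.1], [0.2]], requires_grad=True)  # [dx, dy] as a column

# M = [[a, b, c],      [[scale, 0,     dx],
#      [d, e, f]]  ==   [0,     scale, dy]]
M = torch.cat((torch.eye(2) * scale, dxdy), dim=1)
```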

Then I use affine_grid and grid_sample with the learned matrix to transform the image, and a custom pixel-to-pixel loss against the ground-truth crop to learn the matrix M on a single image.
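In isolation, that transform step can be sketched like this (self-contained, with a dummy image standing in for the real one; `align_corners` is passed explicitly only to avoid warnings on newer PyTorch):

```python
import torch
import torch.nn.functional as F

x = torch.rand(3, 8, 8)  # dummy C,H,W image in place of the real one
scale = torch.tensor([1.0], requires_grad=True)
dxdy = torch.tensor([[0.0], [0.0]], requires_grad=True)

# Build the 2x3 affine matrix and sample the image through it
M = torch.cat((torch.eye(2) * scale, dxdy), dim=1)
grid = F.affine_grid(M.unsqueeze(0), [1] + list(x.shape), align_corners=False)
out = F.grid_sample(x.unsqueeze(0), grid, mode='bilinear', align_corners=False)
# out has shape [1, 3, 8, 8]; with the identity M above it reproduces x
```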

Problems:

First of all, when I print the gradients of the dx, dy, and scale parameters directly, they are different from the gradients of the M matrix. This is wrong.

Second of all, when I print M's gradients, the "b" and "d" entries should never have any gradient, since those two are not parameters (yet when I print M's gradients they do have non-zero gradients), and "a" and "e" should have equal gradients since they're both just the same scale parameter.
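To make the comparison concrete, here is a standalone toy version of the same `torch.cat` construction, with a made-up weighted-sum loss standing in for LapLoss, so the relationship between M's gradients and the leaf parameters' gradients can be checked directly:

```python
import torch

scale = torch.tensor([2.0], requires_grad=True)
dxdy = torch.tensor([[0.1], [0.2]], requires_grad=True)

M = torch.cat((torch.eye(2) * scale, dxdy), dim=1)
M.retain_grad()  # M is an intermediate tensor, not a leaf, so keep its grad

# Made-up loss: a weighted sum over all six entries of M
w = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])
loss = (M * w).sum()
loss.backward()

print(M.grad)      # equals w: every entry populated, including b and d
print(scale.grad)  # tensor([6.]) = grad of a + grad of e = w[0,0] + w[1,1]
print(dxdy.grad)   # tensor([[3.], [6.]]) = the c and f entries of M.grad
```

Note that in this toy case `M.grad` has non-zero entries at b and d (because it is the gradient with respect to the intermediate tensor M, which ignores how M was built), and `scale.grad` comes out as the sum of the a and e entries rather than two equal values.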

Below is the code, as well as what it printed. To compare the matrix gradients with the individual scale/translation gradients, look at the print statements where the counters are equal.

code:

```python
import numpy as np
import torch
import torch.nn.functional as F
import torchvision.transforms.functional as TF
from PIL import Image

# Note: cat_dog (the full input image), LapLoss, the optimizer, and the
# gradient print statements are defined in the rest of the script (not posted).

cat = TF.to_tensor(np.array(Image.open("images/just_dog.png").convert('RGB')))
translated_params = torch.unsqueeze(torch.tensor([0.0, 0.0]), 1)

scale = torch.unsqueeze(torch.tensor([1.0]), 1)
counter = 0
loss_level = 0

def forward2(x, dxdy, the_scale):
    M = torch.cat((torch.eye(2) * the_scale, dxdy), dim=1)
    grid = F.affine_grid(torch.unsqueeze(M, dim=0), [1] + list(x.shape))
    transformed_image = F.grid_sample(x[None, :, :, :], grid, mode='bilinear')
    return transformed_image

for i in range(601):
    predicted = forward2(cat_dog, translated_params, scale)
    criterion = LapLoss(loss_level=loss_level)

    if i == 0:
        criterion = LapLoss(loss_level=loss_level, save=True)

    if i % 200 == 0 and i != 0:
        criterion = LapLoss(loss_level=loss_level, save=True)
        loss_level = loss_level + 1
        if loss_level > 2:
            loss_level = 0

    loss = criterion.forward(torch.unsqueeze(predicted, 0), torch.unsqueeze(cat, 0))
    loss.backward()
    counter = counter + 1
    optimizer.step()
```

printed:

```
counter is:  0 Matrix grads are:  tensor([[-0.1344,  0.0153,  0.4624],
[ 0.1192, -0.4274,  0.3631]])
counter is:  1 Inidivdual gradients are:  translated= tensor([[0.4624],
[0.3631]]) scale= tensor([[-0.5617]])
counter is:  1 Matrix grads are:  tensor([[-0.5981,  0.0277,  1.2482],
[ 0.1807, -0.8871,  0.9514]])
counter is:  2 Inidivdual gradients are:  translated= tensor([[1.2482],
[0.9514]]) scale= tensor([[-1.4852]])
counter is:  2 Matrix grads are:  tensor([[ 0.0132,  0.1046,  0.6005],
[ 0.3383, -0.6538,  0.7037]])
counter is:  3 Inidivdual gradients are:  translated= tensor([[0.6005],
[0.7037]]) scale= tensor([[-0.6406]])
counter is:  3 Matrix grads are:  tensor([[ 0.5854,  0.1962, -0.0363],
[ 0.4518, -0.3168,  0.3824]])
counter is:  4 Inidivdual gradients are:  translated= tensor([[-0.0363],
[ 0.3824]]) scale= tensor([[0.2686]])
counter is:  4 Matrix grads are:  tensor([[ 0.8189,  0.1707, -0.2715],
[ 0.4192, -0.1115,  0.1299]])
```

Thank you so much! Please let me know if you'd like to actually run it yourselves, and I'll post the rest of the code…