Hi! I’m currently trying to learn the best affine transformation for cropping an image. My approach: I take an image, then set the parameters of a 2x3 matrix, let’s call it M, as:

[scale * 1, 0, dx
 0, scale * 1, dy]

where scale, dx, dy are the parameters being learned.

For explanation I will refer to these values as:
[a, b, c
d, e, f]
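For reference, here is a minimal sketch of how I assemble M from those scalars (same `torch.cat` construction as in the code further down; the concrete values here are just placeholders):

```python
import torch

# scale, dx, dy are the learned scalars (placeholder values here)
scale = torch.tensor([1.5], requires_grad=True)
dxdy = torch.tensor([[0.1], [0.2]], requires_grad=True)  # [dx, dy] as a column

# M = [[a, b, c],      [[scale, 0,     dx],
#      [d, e, f]]  ==   [0,     scale, dy]]
M = torch.cat((torch.eye(2) * scale, dxdy), dim=1)
```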

Then I use affine_grid and grid_sample with the learned matrix to transform the image, and a custom pixel-to-pixel loss against the ground-truth crop to learn the matrix M on a single image.
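In isolation, that transform step can be sketched like this (self-contained, with a dummy image standing in for the real one; `align_corners` is passed explicitly only to avoid warnings on newer PyTorch):

```python
import torch
import torch.nn.functional as F

x = torch.rand(3, 8, 8)  # dummy C,H,W image in place of the real one
scale = torch.tensor([1.0], requires_grad=True)
dxdy = torch.tensor([[0.0], [0.0]], requires_grad=True)

# Build the 2x3 affine matrix and sample the image through it
M = torch.cat((torch.eye(2) * scale, dxdy), dim=1)
grid = F.affine_grid(M.unsqueeze(0), [1] + list(x.shape), align_corners=False)
out = F.grid_sample(x.unsqueeze(0), grid, mode='bilinear', align_corners=False)
# out has shape [1, 3, 8, 8]; with the identity M above it reproduces x
```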

Problems:

First of all, when I print the gradients of the dx, dy, and scale parameters directly, they are different from the gradients of the M matrix. This is wrong.

Second of all, when I print M's gradients, the "b" and "d" entries should never have any gradient, since those two are not parameters (yet when I print M's gradients they do have non-zero gradients), and "a" and "e" should have equal gradients since they're both just the same scale parameter.
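To make the comparison concrete, here is a standalone toy version of the same `torch.cat` construction, with a made-up weighted-sum loss standing in for LapLoss, so the relationship between M's gradients and the leaf parameters' gradients can be checked directly:

```python
import torch

scale = torch.tensor([2.0], requires_grad=True)
dxdy = torch.tensor([[0.1], [0.2]], requires_grad=True)

M = torch.cat((torch.eye(2) * scale, dxdy), dim=1)
M.retain_grad()  # M is an intermediate tensor, not a leaf, so keep its grad

# Made-up loss: a weighted sum over all six entries of M
w = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])
loss = (M * w).sum()
loss.backward()

print(M.grad)      # equals w: every entry populated, including b and d
print(scale.grad)  # tensor([6.]) = grad of a + grad of e = w[0,0] + w[1,1]
print(dxdy.grad)   # tensor([[3.], [6.]]) = the c and f entries of M.grad
```

Note that in this toy case `M.grad` has non-zero entries at b and d (because it is the gradient with respect to the intermediate tensor M, which ignores how M was built), and `scale.grad` comes out as the sum of the a and e entries rather than two equal values.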

Below is the code, as well as what it printed. To compare the matrix gradients with the individual scale/translation gradients, look at the print statements where the counters are equal.

code:

```python
import numpy as np
import torch
import torch.nn.functional as F
import torchvision.transforms.functional as TF
from PIL import Image

# Note: cat_dog (the full input image), LapLoss, the optimizer, and the
# gradient print statements are defined in the rest of the script (not posted).

cat = TF.to_tensor(np.array(Image.open("images/just_dog.png").convert('RGB')))
translated_params = torch.unsqueeze(torch.tensor([0.0, 0.0]), 1)

scale = torch.unsqueeze(torch.tensor([1.0]), 1)
counter = 0
loss_level = 0

def forward2(x, dxdy, the_scale):
    M = torch.cat((torch.eye(2) * the_scale, dxdy), dim=1)
    grid = F.affine_grid(torch.unsqueeze(M, dim=0), [1] + list(x.shape))
    transformed_image = F.grid_sample(x[None, :, :, :], grid, mode='bilinear')
    return transformed_image

for i in range(601):
    predicted = forward2(cat_dog, translated_params, scale)
    criterion = LapLoss(loss_level=loss_level)

    if i == 0:
        criterion = LapLoss(loss_level=loss_level, save=True)

    if i % 200 == 0 and i != 0:
        criterion = LapLoss(loss_level=loss_level, save=True)
        loss_level = loss_level + 1
        if loss_level > 2:
            loss_level = 0

    loss = criterion.forward(torch.unsqueeze(predicted, 0), torch.unsqueeze(cat, 0))
    loss.backward()
    counter = counter + 1
    optimizer.step()
```

printed:

```
counter is:  0 Matrix grads are:  tensor([[-0.1344,  0.0153,  0.4624],
[ 0.1192, -0.4274,  0.3631]])
counter is:  1 Inidivdual gradients are:  translated= tensor([[0.4624],
[0.3631]]) scale= tensor([[-0.5617]])
counter is:  1 Matrix grads are:  tensor([[-0.5981,  0.0277,  1.2482],
[ 0.1807, -0.8871,  0.9514]])
counter is:  2 Inidivdual gradients are:  translated= tensor([[1.2482],
[0.9514]]) scale= tensor([[-1.4852]])
counter is:  2 Matrix grads are:  tensor([[ 0.0132,  0.1046,  0.6005],
[ 0.3383, -0.6538,  0.7037]])
counter is:  3 Inidivdual gradients are:  translated= tensor([[0.6005],
[0.7037]]) scale= tensor([[-0.6406]])
counter is:  3 Matrix grads are:  tensor([[ 0.5854,  0.1962, -0.0363],
[ 0.4518, -0.3168,  0.3824]])
counter is:  4 Inidivdual gradients are:  translated= tensor([[-0.0363],
[ 0.3824]]) scale= tensor([[0.2686]])
counter is:  4 Matrix grads are:  tensor([[ 0.8189,  0.1707, -0.2715],
[ 0.4192, -0.1115,  0.1299]])
```

Thank you so much! Please let me know if you'd like to actually run it yourselves, and I'll post the rest of the code…