Hi! I'm currently trying to learn the best affine transformation for cropping an image. The way I'm doing this is: I take an image, then set up a 2x3 matrix, let's call it M, with parameters:

[scale * 1, 0, dx
0, scale * 1, dy]

where scale, dx, and dy are the parameters being learned.
For the explanation below I will refer to these entries as:
[a, b, c
d, e, f]
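For concreteness, here is a tiny standalone snippet (with arbitrary example values, not my real ones) showing how these entries map onto the learned parameters:

```python
import torch

# Arbitrary example values, just to illustrate the layout of M
scale = torch.tensor([[2.0]], requires_grad=True)          # learned scale
dxdy = torch.tensor([[0.5], [-0.5]], requires_grad=True)   # learned (dx, dy)

# a = e = scale, b = d = 0, c = dx, f = dy
M = torch.cat((torch.eye(2) * scale, dxdy), dim=1)
print(M)  # entries: [[2.0, 0.0, 0.5], [0.0, 2.0, -0.5]]
```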
I then use affine_grid and grid_sample with the learned matrix to transform the image, and optimise M on a single image using a custom pixel-to-pixel loss against the ground-truth crop.
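As a shape sanity check, the transform step looks roughly like this (a minimal sketch with a random stand-in image, not my actual pipeline):

```python
import torch
import torch.nn.functional as F

x = torch.rand(3, 8, 8)  # stand-in for a 3xHxW image tensor
M = torch.tensor([[1.0, 0.0, 0.2],
                  [0.0, 1.0, -0.1]])

# affine_grid wants theta of shape (N, 2, 3) and a size of (N, C, H, W)
grid = F.affine_grid(M.unsqueeze(0), [1] + list(x.shape), align_corners=False)
out = F.grid_sample(x.unsqueeze(0), grid, mode='bilinear', align_corners=False)
print(out.shape)  # torch.Size([1, 3, 8, 8])
```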
First of all, when I print the gradients of the dx, dy, and scale parameters directly, they are different from the gradients of the M matrix. This is wrong.
Second of all, when I print M's gradients, the "b" and "d" entries should never have any gradient, since those two are not parameters (yet when I print M's gradients they do have non-zero gradients), and "a" and "e" should have equal gradients, since they are both just the same scale parameter.
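To make this easier to discuss without my images and loss, here is a minimal toy version (the weighting tensor is arbitrary and just stands in for the real loss) that builds M the same way, hooks its gradient, and prints the leaf gradients:

```python
import torch

scale = torch.tensor([[1.0]], requires_grad=True)
dxdy = torch.tensor([[0.0], [0.0]], requires_grad=True)

M = torch.cat((torch.eye(2) * scale, dxdy), dim=1)
M.register_hook(lambda grad: print("M grad:\n", grad))  # prints all six entries

# arbitrary stand-in for the real pixel loss
loss = (M * torch.arange(6.0).reshape(2, 3)).sum()
loss.backward()
print("scale grad:", scale.grad)  # tensor([[4.]])
print("dxdy grad:", dxdy.grad)    # values 2.0 and 5.0
```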
Below is the code, along with what it printed. To compare the matrix gradients with the individual scale/translation gradients, please look at the print statements where the counters are equal.
```python
import numpy as np
import torch
import torch.nn.functional as F
import torchvision.transforms.functional as TF
from PIL import Image

# LapLoss is my custom pixel-to-pixel loss, defined elsewhere

cat = TF.to_tensor(np.array(Image.open("images/just_dog.png").convert('RGB')))
cat_dog = TF.to_tensor(np.array(Image.open("images/mask_rect_65_5_30_35_dog_patch.png").convert('RGB')))
folder_path = "images/results_mask_dog_rect_65_5_30_35_patch/"

translated_params = torch.unsqueeze(torch.tensor([0.0, 0.0]), 1)
translated_params.requires_grad_(True)
scale = torch.unsqueeze(torch.tensor([1.0]), 1)
scale.requires_grad_(True)
counter = 0
loss_level = 0

def forward2(x, dxdy, the_scale):
    M = torch.cat((torch.eye(2) * the_scale, dxdy), dim=1)
    M.register_hook(lambda grad: print("counter is: ", counter, grad))
    # affine_grid expects an (N, C, H, W) size, so prepend the batch dim
    grid = F.affine_grid(torch.unsqueeze(M, dim=0), [1] + list(x.shape))
    transformed_image = F.grid_sample(x[None, :, :, :], grid, mode='bilinear')
    return transformed_image

optimizer = torch.optim.Adam([scale, translated_params], lr=0.007)
for i in range(601):
    optimizer.zero_grad()
    predicted = forward2(cat_dog, translated_params, scale)
    criterion = LapLoss(loss_level=loss_level)
    if i == 0:
        criterion = LapLoss(loss_level=loss_level, save=True)
    if i % 200 == 0 and i != 0:
        criterion = LapLoss(loss_level=loss_level, save=True)
        loss_level = loss_level + 1
        if loss_level > 2:
            loss_level = 0
    loss = criterion.forward(torch.unsqueeze(predicted, 0), torch.unsqueeze(cat, 0))
    loss.backward()
    counter = counter + 1
    optimizer.step()
    print("counter is: ", counter, "gradients are: ", "translated=", translated_params.grad.data, "scale=", scale.grad.data)
```
```
counter is: 0 Matrix grads are: tensor([[-0.1344,  0.0153,  0.4624], [ 0.1192, -0.4274,  0.3631]])
counter is: 1 Inidivdual gradients are: translated= tensor([[0.4624], [0.3631]]) scale= tensor([[-0.5617]])
counter is: 1 Matrix grads are: tensor([[-0.5981,  0.0277,  1.2482], [ 0.1807, -0.8871,  0.9514]])
counter is: 2 Inidivdual gradients are: translated= tensor([[1.2482], [0.9514]]) scale= tensor([[-1.4852]])
counter is: 2 Matrix grads are: tensor([[ 0.0132,  0.1046,  0.6005], [ 0.3383, -0.6538,  0.7037]])
counter is: 3 Inidivdual gradients are: translated= tensor([[0.6005], [0.7037]]) scale= tensor([[-0.6406]])
counter is: 3 Matrix grads are: tensor([[ 0.5854,  0.1962, -0.0363], [ 0.4518, -0.3168,  0.3824]])
counter is: 4 Inidivdual gradients are: translated= tensor([[-0.0363], [ 0.3824]]) scale= tensor([[0.2686]])
counter is: 4 Matrix grads are: tensor([[ 0.8189,  0.1707, -0.2715], [ 0.4192, -0.1115,  0.1299]])
```
Thank you so much! Please let me know if you want to actually run it yourself, and I'll post the rest of the code…