Autograd issue with F.grid_sample()

I need the gradient of this warping operation.

import torch.nn.functional as F

# tensor : output of my neural network
#          tensor.requires_grad = True

warped_tensor = F.grid_sample(tensor,
                              grid,
                              align_corners=True,
                              mode='bilinear',
                              padding_mode='zeros')

The operation does return a gradient, but it does not seem to be correct. I tried using both warped_tensor and the plain tensor for my loss, and with warped_tensor my network does not optimise the weights correctly.

Is this an autograd issue or is there some other issue I am not seeing here?


The same happens when I use kornia:

import kornia

# tensor : output of my neural network
#          tensor.requires_grad = True
# M      : transformation matrix previously created with kornia

warped_tensor: torch.Tensor = kornia.warp_affine(tensor, M, dsize=(h_original, w_original))
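For reference, M could be built with something like this (hypothetical values; note that older kornia versions expect scale as a (B,) tensor, newer ones as (B, 2)):

import torch
import kornia

center = torch.tensor([[w_original / 2.0, h_original / 2.0]])  # (B, 2), hypothetical
angle = torch.tensor([30.0])                                   # rotation in degrees, (B,)
scale = torch.ones(1)                                          # (B,) in older kornia
M = kornia.get_rotation_matrix2d(center, angle, scale)         # (B, 2, 3)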

Hi,

This is most likely due to your network. grid_sample is widely used and well tested, so I think its gradient is correct.
You can double-check by running torch.autograd.gradcheck with double-typed inputs to verify that the computed gradients are correct.
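For example, something along these lines (a minimal sketch with an identity affine grid; gradcheck needs small, double-precision inputs):

import torch
import torch.nn.functional as F

tensor = torch.randn(1, 1, 4, 4, dtype=torch.double, requires_grad=True)
theta = torch.tensor([[[1., 0., 0.], [0., 1., 0.]]], dtype=torch.double)  # identity transform
grid = F.affine_grid(theta, size=(1, 1, 4, 4), align_corners=True)

def warp(t):
    return F.grid_sample(t, grid, align_corners=True,
                         mode='bilinear', padding_mode='zeros')

# Prints True if the analytic gradients match the numerical ones.
print(torch.autograd.gradcheck(warp, (tensor,)))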

I will check my network with torch.autograd.gradcheck.

However, isn’t it weird that my network trains when I use loss(tensor, truth) but not with loss(warped_tensor, warped_truth), where warped_truth had the exact same transformation applied?

I will try to provide you with some executable code soon.

That depends on the warping you apply.
You can lose precision when you warp if multiple input pixels collapse into a single output pixel.
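For instance (a small sketch with an identity grid, not your exact transform): if the grid only samples a few points from the input, most input pixels never receive any gradient:

import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 8, 8, requires_grad=True)
theta = torch.tensor([[[1., 0., 0.], [0., 1., 0.]]])  # identity transform
# Sample only a 2x2 output from the 8x8 input.
grid = F.affine_grid(theta, size=(1, 1, 2, 2), align_corners=True)
F.grid_sample(x, grid, align_corners=True, mode='bilinear').sum().backward()
# Only the pixels that were actually sampled get a nonzero gradient.
print((x.grad != 0).float().mean())  # small fraction of the input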

I can take a look if you have a small code sample, yes!

This makes sense! I am quite certain now that my error lies here.

Here is a gist using kornia.warp_affine for the transformation, since it uses F.grid_sample under the hood (source for kornia.warp_affine).

Here is my code:


How can I now make sure that my gradient is not losing precision? Does the size of the resulting transformed tensor just need to be big enough? The interpolation with nearest might also have this problem. Would bilinear fix that, or should I make sure that I don’t scale down?

I don’t have kornia at hand to try it, but yes, these sound like good things to try out.
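Since I don’t have kornia here, a small sketch using F.grid_sample directly (with an identity affine grid) can show how much of the input actually receives gradient in each setting:

import torch
import torch.nn.functional as F

def grad_coverage(mode, out_hw):
    x = torch.randn(1, 1, 8, 8, requires_grad=True)
    theta = torch.tensor([[[1., 0., 0.], [0., 1., 0.]]])  # identity transform
    grid = F.affine_grid(theta, size=(1, 1, *out_hw), align_corners=True)
    F.grid_sample(x, grid, align_corners=True, mode=mode).sum().backward()
    return (x.grad != 0).float().mean().item()

print('nearest,  downscaled:', grad_coverage('nearest', (4, 4)))   # few pixels get gradient
print('bilinear, downscaled:', grad_coverage('bilinear', (4, 4)))  # better, but still gaps
print('bilinear, same size :', grad_coverage('bilinear', (8, 8)))  # full coverage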

Has there been any update on this? My first guess was also that nearest mode might not work, but changing it did not help in my case.