# - tensor : output of my neural network
# tensor.requires_grad = True
warped_tensor = F.grid_sample(tensor,
                              grid,
                              align_corners=True,
                              mode='bilinear',
                              padding_mode='zeros')

This operation returns a gradient; however, it does not seem to be correct. I used both warped_tensor and the plain tensor in my loss, and with warped_tensor my network does not optimise the weights correctly.

Is this an autograd issue or is there some other issue I am not seeing here?

# - tensor : output of my neural network
# tensor.requires_grad = True
# - M : transformation matrix previously created with kornia
warped_tensor: torch.Tensor = kornia.warp_affine(tensor, M, dsize=(h_original, w_original))

This is most likely due to your network. This function is widely used and tested, so I think it is correct.
You can double-check by running torch.autograd.gradcheck with double-typed inputs to verify that the computed gradients are correct.
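The gradcheck suggestion above can be sketched roughly like this. The shapes and the grid here are placeholders, not the original poster's actual data; the key points are that gradcheck needs double-precision inputs with requires_grad set, and that the grid should stay inside the sampling domain:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# gradcheck compares analytical and numerical gradients, so inputs must be
# double precision and require gradients.
tensor = torch.randn(1, 1, 4, 4, dtype=torch.double, requires_grad=True)

# Normalized sampling coordinates, kept away from the [-1, 1] borders to
# avoid non-differentiable boundary points.
grid = (torch.rand(1, 4, 4, 2, dtype=torch.double) * 1.6 - 0.8).requires_grad_(True)

ok = torch.autograd.gradcheck(
    lambda t, g: F.grid_sample(t, g, align_corners=True,
                               mode='bilinear', padding_mode='zeros'),
    (tensor, grid),
)
print(ok)
```

If this prints True, the gradients of grid_sample itself are fine and the problem is elsewhere in the pipeline.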

I will check my network with torch.autograd.gradcheck.

However, isn't it weird that my network trains when I use loss(tensor, truth) but not with loss(warped_tensor, warped_truth), where warped_truth had the exact same transformation applied?

I will try to provide you with some executable code soon.
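In the meantime, the setup described above can be reduced to a minimal sketch: warp both the prediction and the target with the same transformation and check that gradients still flow back to the prediction. The identity transform and the tensor shapes below are placeholders standing in for the real network output and the kornia-built matrix M:

```python
import torch
import torch.nn.functional as F

pred = torch.randn(1, 1, 8, 8, requires_grad=True)   # stand-in for the network output
truth = torch.randn(1, 1, 8, 8)                       # stand-in for the ground truth

# Identity affine transform as a placeholder for the real transformation.
theta = torch.tensor([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]])
grid = F.affine_grid(theta, pred.shape, align_corners=True)

# Apply the exact same warp to prediction and target, as in the question.
warped_pred = F.grid_sample(pred, grid, align_corners=True)
warped_truth = F.grid_sample(truth, grid, align_corners=True)

loss = F.mse_loss(warped_pred, warped_truth)
loss.backward()

# The gradient should exist and be finite everywhere.
print(pred.grad is not None and torch.isfinite(pred.grad).all().item())
```

If this prints True for the real transformation as well, the warp is not blocking the gradient, and the optimisation problem lies elsewhere (for example, in how the loss interacts with regions padded with zeros).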

This makes sense! I am quite certain now that my error lies here.

Here is a gist using kornia.warp_affine for the transformation, as it uses F.grid_sample under the hood (source for kornia.warp_affine).

Here is my code:

How can I now make sure that my gradient is not losing precision? Does the resulting transformation tensor just need to be big enough? Interpolation with nearest might also have this problem. Would bilinear fix that, or should I make sure that I don't scale down?
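The downscaling concern can be made concrete with a small experiment (the sizes here are illustrative, not from the original code). When the sampling grid is coarser than the input, mode='nearest' routes each output gradient to exactly one input pixel, so the skipped pixels get zero gradient; mode='bilinear' spreads each output gradient over the neighbouring pixels, so more of the input receives a signal:

```python
import torch
import torch.nn.functional as F

src_nearest = torch.randn(1, 1, 8, 8, requires_grad=True)
src_bilinear = src_nearest.detach().clone().requires_grad_(True)

# A 4x4 output grid over an 8x8 input, i.e. a 2x downscale via an identity
# affine transform.
theta = torch.tensor([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]])
grid = F.affine_grid(theta, (1, 1, 4, 4), align_corners=True)

F.grid_sample(src_nearest, grid, mode='nearest', align_corners=True).sum().backward()
F.grid_sample(src_bilinear, grid, mode='bilinear', align_corners=True).sum().backward()

# Count how many input pixels actually receive a gradient in each mode.
n_nearest = (src_nearest.grad != 0).sum().item()
n_bilinear = (src_bilinear.grad != 0).sum().item()
print(n_nearest, n_bilinear)
```

With nearest, only as many input pixels get a gradient as there are output pixels; bilinear reaches more of the input. Either way, downscaling means most input pixels receive little or no gradient through the warp, so keeping the output size at least as large as the input (as with dsize=(h_original, w_original) above) avoids the problem entirely.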