Gradient wrt grid in grid_sample

Let’s say I have two batches of single-channel images, each of size 8x1x128x128. I am trying to apply a spatial transformation to one batch to align it with the other (alignment measured with an MSE loss).

This is what the relevant part of my code looks like:

import torch
import torch.nn as nn
import torch.nn.functional as F

# two batches of images are img1, img2

# net is the network I am using to generate a displacement for each pixel
# disp has size Nx128x128x2, with values in [-1, 1]
disp = net(img1)

# Generating an affine grid for the identity transformation
theta = torch.FloatTensor([1, 0, 0, 0, 1, 0])
theta = theta.view(2, 3)
theta = theta.expand(disp.size(0), 2, 3)
identity_grid = F.affine_grid(theta, img1.size())

new_grid = identity_grid + disp
# Apply the spatial transformation to img1
img1t = F.grid_sample(img1, new_grid)

Lsim = nn.MSELoss()(img1t, img2)
Lsim.backward()

My aim is to backpropagate into the parameters of net through disp. But when I call Lsim.backward(), new_grid.grad and disp.grad are both None. What might be going wrong?

new_grid and disp are not leaf nodes of the computation graph, so gradients are not accumulated in them. However, you can register backward hooks to view the gradient at those points, as in the sketch below.
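
For example, a minimal sketch (reusing the variables from the snippet above) that inspects those gradients with register_hook, or alternatively with retain_grad:

disp = net(img1)
new_grid = identity_grid + disp

# Option 1: a backward hook fires with the gradient during backward()
disp.register_hook(lambda g: print('grad wrt disp:', g.norm()))

# Option 2: ask autograd to keep .grad on this non-leaf tensor
new_grid.retain_grad()

img1t = F.grid_sample(img1, new_grid)
Lsim = nn.MSELoss()(img1t, img2)
Lsim.backward()

print(new_grid.grad.size())  # now populated: torch.Size([8, 128, 128, 2])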

Thanks for the pointer. :smiley:

How does grid_sample do the backward pass? I am confused about how the gradient of the loss w.r.t. the sampling indices is generated.
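
By default grid_sample does bilinear interpolation, and the interpolation weights are (piecewise) differentiable functions of the sampling coordinates, so autograd can produce a gradient of the loss w.r.t. the grid. Here is a minimal 1-D sketch of the idea (an illustration of the math, not PyTorch's actual kernel):

import torch

# 1-D analogue of bilinear sampling: read a signal at a fractional index x
signal = torch.tensor([0.0, 1.0, 4.0, 9.0])
x = torch.tensor(1.3, requires_grad=True)  # fractional sampling position

x0 = int(x.detach().floor())  # left neighbour index (here 1)
w = x - x0                    # fractional part, differentiable in x
value = (1 - w) * signal[x0] + w * signal[x0 + 1]  # linear interpolation

value.backward()
# d(value)/dx = signal[x0 + 1] - signal[x0] = 4 - 1 = 3
print(x.grad)  # tensor(3.)

The 2-D case works the same way, just with four neighbouring pixels and weights that depend on both grid coordinates.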