Is there any way to get the scalar loss to backprop through the model even when the loss is calculated from tensors that are no longer connected to the model's output? A custom autograd function? Currently I am getting None gradients for the model's parameters.
im1 = torch.tensor(..., requires_grad=True) # image
im2 = torch.tensor(..., requires_grad=True) # image 2
im3 = torch.tensor(..., requires_grad=True) # image 3
y = model(im1)
# do ops on im2 based on y that require breaking the graph for y
loss = criterion(im2, im3).mean() # scalar
loss.backward()
params = list(model.parameters())[0]
print(params.grad) # None
If I multiply the loss as follows, then I do get gradients for the model:
loss = criterion(im2, im3).mean() * y.mean() / y.mean()
Is multiplying like this a valid way to get the gradients to flow through the model?
If you break the graph, you won't be able to use autograd, I'm afraid. And no, multiplying by y.mean() / y.mean() is not a valid workaround: that factor is identically 1, so the two gradient contributions it sends back through y cancel to (numerically) zero. You get a populated .grad, but not the gradient of your actual loss.
The only way is to wrap the ops that break the graph in a custom autograd.Function and write the backward for that op yourself: https://pytorch.org/docs/stable/notes/extending.html
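For illustration, here is a minimal runnable sketch of that pattern. It is not your actual op: an element-wise multiply routed through NumPy stands in for whatever graph-breaking op you apply to im2, and backward() supplies the hand-derived gradients so they can keep flowing to whatever produced y (i.e. the model):

import torch

class NumpyMul(torch.autograd.Function):
    # Wraps a graph-breaking op (here, a detour through NumPy)
    # and supplies the backward pass by hand.

    @staticmethod
    def forward(ctx, y, im2):
        ctx.save_for_backward(y, im2)
        # Anything goes here, even non-differentiable code: autograd
        # does not trace forward(), it just calls backward() below.
        out = y.detach().numpy() * im2.detach().numpy()
        return torch.from_numpy(out)

    @staticmethod
    def backward(ctx, grad_out):
        y, im2 = ctx.saved_tensors
        # Hand-derived gradients of out = y * im2, one per forward input.
        return grad_out * im2, grad_out * y

y = torch.randn(3, requires_grad=True)
im2 = torch.randn(3, requires_grad=True)
loss = NumpyMul.apply(y, im2).mean()
loss.backward()
print(y.grad)  # populated instead of None

In your case, y would be the model output, so the gradient returned for it continues back into the model's parameters as usual.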
Thanks! I think I understand the basics of a custom autograd.Function, but I fail to see how using one gets around the gradients not flowing backwards natively?