Is there any way to get a scalar loss to backprop through the model when the loss is computed from tensors that are not connected to the model's graph? A custom autograd function, perhaps? Currently I am getting None gradients for the model.

im1 = torch.tensor(..., requires_grad=True) # image
im2 = torch.tensor(..., requires_grad=True) # image 2
im3 = torch.tensor(..., requires_grad=True) # image 3
y = model(im1)
# do ops on im2 based on y that require breaking the grad for y
loss = criterion(im2, im3).mean() # scalar
loss.backward()
params = list(model.parameters())[0]
print(params.grad) # None

If I multiply the loss as follows then I do get gradients for the model:

loss = criterion(im2, im3).mean() * y.mean() / y.mean()

Is multiplying like this a valid way to get the gradients to flow through the model?
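
One way to check what the multiply trick actually does is to look at the gradient values, not just whether they are None. The sketch below is not the original code: `model` is a hypothetical `torch.nn.Linear` stand-in and `detached_loss` is a made-up constant playing the role of `criterion(im2, im3).mean()`. Since `y.mean() / y.mean()` is identically 1, its derivative with respect to `y` is zero, so the gradients should come out (numerically) zero.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for the real model; double precision keeps rounding noise tiny.
model = torch.nn.Linear(4, 3).double()
im1 = torch.randn(8, 4, dtype=torch.float64)
y = model(im1)

# Made-up constant standing in for a loss computed from tensors detached from the model.
detached_loss = torch.tensor(2.5, dtype=torch.float64)

# The trick from the question: multiply and divide by y.mean().
loss = detached_loss * y.mean() / y.mean()
loss.backward()

weight_grad = model.weight.grad
grad_is_none = weight_grad is None              # the graph is connected, so a grad exists
max_abs_grad = weight_grad.abs().max().item()   # but it carries no signal
print(grad_is_none, max_abs_grad)
```

So the graph is connected and `.grad` is populated, but the values are zero up to floating-point rounding, which means an optimizer step would not move the parameters in any useful direction.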

If you break the graph, autograd won’t be able to propagate gradients through the break, I’m afraid.
The only way is to wrap the ops that break the graph in a custom autograd.Function and write the backward for that op yourself: https://pytorch.org/docs/stable/notes/extending.html
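
A minimal sketch of that idea, under stated assumptions: `PythonSin` is a hypothetical op whose forward leaves autograd entirely (a round trip through plain Python lists stands in for whatever ops break the graph), and whose backward supplies the derivative by hand. The `Linear` model, the random tensors, and the MSE loss are placeholders, not the code from the question.

```python
import math
import torch


class PythonSin(torch.autograd.Function):
    """Hypothetical op: forward runs outside autograd, backward is written by hand."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        # Round trip through Python floats: autograd cannot trace this.
        values = [[math.sin(v) for v in row] for row in x.detach().tolist()]
        return torch.tensor(values, dtype=x.dtype, device=x.device)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # d/dx sin(x) = cos(x), chained with the incoming gradient.
        return grad_output * torch.cos(x)


torch.manual_seed(0)
model = torch.nn.Linear(4, 3).double()        # placeholder model
im1 = torch.randn(8, 4, dtype=torch.float64)  # placeholder input
im3 = torch.randn(8, 3, dtype=torch.float64)  # placeholder target

y = model(im1)
im2 = PythonSin.apply(y)                      # graph-breaking op, now differentiable
loss = torch.nn.functional.mse_loss(im2, im3)
loss.backward()

weight_grad = model.weight.grad
print(weight_grad is not None)
```

The Function does not make the forward ops traceable; it tells autograd what gradient to send back to `y` when it reaches the op. That only works if you can actually write down (or approximate) that derivative for your ops.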

Thanks! I think I understand the basics of a custom autograd.Function, but I fail to see how using one works around the gradients not flowing backwards natively. Could you elaborate?