By default PyTorch uses autograd to compute gradients for layer weights based on their contribution to the loss, which means that weights which did not contribute to computing the loss have no gradients flowing through them.
For a rather unusual research project I need to backpropagate gradients through a network that did not compute the output. Essentially, I want to define a custom, architecture-specific backward pass. My question is: can I do this in PyTorch? If not, what library would you recommend?
I know it sounds very unintuitive, so here’s an expanded explanation:
I have a process P that generates a tensor Z.
I have a network N that maps S to X. The dimensions of X and Z are identical.
I have a ground-truth label Y (same dimensions as X and Z) and I pass it to the loss function together with Z to compute the loss.
I wish to backpropagate the loss through network N and update its weights as if it computed Z.
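One way the setup above can be expressed in plain PyTorch (a minimal sketch with made-up shapes; `N`, `S`, `Z`, and `Y` here just stand in for the real objects) is a straight-through-style value substitution: the forward value of the tensor fed to the loss equals Z, but the gradient flows into X and hence into N’s weights. Whether this matches the intended semantics depends on the project, so treat it as one candidate approach, not the definitive answer.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the real objects in the question.
N = nn.Linear(8, 4)          # network N, mapping S to X
S = torch.randn(2, 8)        # input to N
Z = torch.randn(2, 4)        # tensor produced by the external process P
Y = torch.randn(2, 4)        # ground-truth label (same shape as X and Z)

X = N(S)                     # network output; same shape as Z

# Straight-through-style substitution: forward value is exactly Z,
# but d(Z_hat)/dX = 1, so gradients flow into X and into N's weights.
Z_hat = X + (Z - X).detach()

loss = nn.functional.mse_loss(Z_hat, Y)
loss.backward()

# N's parameters now carry gradients, as if N had produced Z.
print(all(p.grad is not None for p in N.parameters()))
```

If the required backward pass is more exotic than "pretend N produced Z", a `torch.autograd.Function` subclass with a hand-written `backward` is the general-purpose mechanism in PyTorch for custom gradients.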