Normally one would use a forward pass to calculate a loss, then perform a backward pass, and the gradient is calculated automatically.
My situation is that I don’t have the loss, but I do have the gradient already calculated. My question is: how do I set a custom gradient on a (fully connected) network and run the backward optimization using that custom gradient? Thanks
You could assign the precomputed gradients to the
.grad attribute of all parameters and call
optimizer.step() afterwards. Here is a small example:
import torch
import torch.nn as nn

lin = nn.Linear(1, 1, bias=False)
optimizer = torch.optim.SGD(lin.parameters(), lr=1e-3)
lin.weight.grad = torch.ones_like(lin.weight) * 10  # assign the precomputed gradient
optimizer.step()  # update the parameters using the assigned gradient
Thank you! What if I have a network, e.g. a two-layer network? Can I assign the grad to the last layer and then call backward() to perform the backpropagation through the first layer?
Yes, you could manually manipulate the gradients of any layer using the previous code snippet, or you could use hooks (via
register_hook), which might be cleaner, especially if you are working with multiple layers.
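For illustration, a minimal sketch of the hook approach on a hypothetical two-layer network: a hook registered on the intermediate activation replaces its gradient with a custom one, and the first layer's gradients are then derived from it during the backward pass.

```python
import torch
import torch.nn as nn

# illustrative two-layer network
net = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 1))
x = torch.randn(2, 4)

# forward through the first layer, then register a hook on the
# intermediate activation to replace its gradient with a custom one
h = net[0](x)
h.register_hook(lambda grad: torch.ones_like(grad))  # custom dLoss/dh
out = net[1](h)
out.backward(torch.ones_like(out))

# net[0].weight.grad was now computed from the custom gradient
print(net[0].weight.grad.shape)  # torch.Size([8, 4])
```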
Thank you for your answer.
I realized that I have a slightly different problem here.
Rather than changing the weights of a layer, what I would like to do is set the gradient of the loss w.r.t. the output of my network, i.e. dLoss/dOutput, which is not part of any network layer. It is more like setting a grad for a tensor variable. To be more specific, my pseudocode would be:
output = myNetwork(input)
my_grad = getMyGradient() #my custom dLoss/dOutput
output.grad = my_grad
output.backward() #usually backward starts from loss, but I would like to start from the output.
What would be the correct implementation for this kind of task?
Thank you in advance.
You can pass the gradient directly to the
backward() operation as an argument:
It is working! Thank you!