Manually specifying gradients in optimizer

For illustration, here’s a toy model:

import torch

# Toy data: 50 samples with 5 features, and a simple linear model
input = torch.distributions.normal.Normal(loc=0, scale=1).sample([50, 5])
target = torch.ones([50, 1])
simple_model = torch.nn.Linear(5, 1)
output = simple_model(input)

which I could optimise through standard procedures:

import torch.optim as optim

optimizer = optim.Adam(simple_model.parameters(), lr=1e-3)
criterion = torch.nn.MSELoss()

optimizer.zero_grad()             # clear any stale gradients
loss = criterion(output, target)
loss.backward()                   # populates .grad on every parameter
optimizer.step()                  # updates the parameters from .grad

However, my actual application is more complicated, and I deal with explicit gradients rather than a loss. For example, I only have access to the gradients returned by torch.autograd.grad(_, simple_model.parameters()). Is there any way I can use these gradients to update the parameters of simple_model directly, without writing my own optimiser?


Have a look at this tutorial, where a manual update is described.
If it’s possible to assign your custom gradients to the .grad attribute of all parameters, you could still use an optimizer directly.
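For example, here is a minimal sketch of the second approach for the model above, assuming your gradients come from torch.autograd.grad on some scalar (some_scalar below is just a stand-in for whatever quantity you differentiate in your real application):

# some_scalar is a placeholder for the scalar you actually differentiate
grads = torch.autograd.grad(some_scalar, simple_model.parameters())

# Copy the externally computed gradients into each parameter's .grad field
for p, g in zip(simple_model.parameters(), grads):
    p.grad = g

# The optimizer only reads .grad, so a normal step now applies your gradients
optimizer.step()
optimizer.zero_grad()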


Is the pseudocode something like this:

loss = criterion(mdl(x), y)
loss.backward()                     # fills w.grad for every parameter

# modify (or overwrite) the gradients before the optimizer uses them
for name, w in mdl.named_parameters():
    w.grad = w.grad + 1.0

optimizer.step()                    # applies the modified gradients

note: I wanted to do something like the above, but collect the gradients from an RPC call (hence my interest in an actual code example; the linked tutorial didn’t seem super useful). In my real example I’d probably do something like w.grad = rpc_grads.

Your code looks fine for a single model, but I’m not sure how you are using it in the RPC use case, so you might need to gather the gradients first.
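As a rough sketch, assuming the remote side hands you a dict mapping parameter names to gradient tensors (the gather_remote_grads helper below is hypothetical, not a real PyTorch API):

# Hypothetical helper: collect gradients from remote workers, e.g. over torch.distributed.rpc,
# returning {parameter_name: gradient_tensor} with shapes matching the local model
rpc_grads = gather_remote_grads()

optimizer.zero_grad()
for name, w in mdl.named_parameters():
    # assign the externally computed gradient; shape and dtype must match w
    w.grad = rpc_grads[name].to(w.device)

optimizer.step()   # the optimizer consumes .grad as usual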