Can you do an optimization step by just setting param.grad?

For example, without a backward pass:

for param in model.parameters():
    # .grad has to be a tensor with the same shape as the parameter,
    # so fill it rather than assigning a bare float
    param.grad = torch.full_like(param, 3.1415)
optimizer.step()

What about doing a backward pass but then modifying the grads?

loss.backward()
for param in model.parameters():
    param.grad += 3.1415
optimizer.step()

Yes, both work. By default, autograd.backward() computes the gradients without building a computation graph for the gradients themselves (you would only need that graph if you wanted to, e.g., do a double backward). So making non-differentiable modifications to the gradients is fine, as long as you aren't planning on something like MAML or a gradient penalty, which need to differentiate through the gradients.
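For instance, here is a minimal runnable sketch of both variants; the nn.Linear model, the learning rate, and the dummy input are just assumptions for illustration, not from your snippet:

import torch
import torch.nn as nn

# Hypothetical setup: a tiny model and a plain SGD optimizer.
model = nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Variant 1: no backward pass, just write the gradients yourself.
# .grad must be a tensor with the same shape as the parameter.
optimizer.zero_grad()
for param in model.parameters():
    param.grad = torch.full_like(param, 3.1415)
optimizer.step()

# Variant 2: do a backward pass, then modify the gradients in place.
x = torch.randn(8, 4)
loss = model(x).sum()
optimizer.zero_grad()
loss.backward()
for param in model.parameters():
    param.grad += 3.1415  # non-differentiable tweak; fine since create_graph=False
optimizer.step()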

If you look at the implementation of e.g. torch.optim.SGD, you can see that it just makes an in-place update to each model parameter p using whatever is in p.grad.
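Conceptually (ignoring momentum, weight decay, and the other options the real implementation handles), the update boils down to something like this simplified sketch, not the actual library code:

import torch

@torch.no_grad()
def sgd_step(params, lr):
    # Plain SGD: p <- p - lr * p.grad, applied in place,
    # skipping any parameter whose .grad is None.
    for p in params:
        if p.grad is not None:
            p.add_(p.grad, alpha=-lr)

# e.g. sgd_step(model.parameters(), lr=0.1)

So whatever ends up in p.grad, whether autograd put it there or you did, is exactly what gets applied to the parameter.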