Omit loss.backward() for forward-only algos?

loss.backward() is only responsible for populating the .grad fields of the parameters; it does not touch the parameter values themselves. The optimizer's step() is what actually updates the parameters using those .grad fields.
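A minimal sketch of that division of labor (the model, data, and optimizer here are just placeholders):

```python
import torch

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x, y = torch.randn(8, 4), torch.randn(8, 1)
loss = torch.nn.functional.mse_loss(model(x), y)

optimizer.zero_grad()
loss.backward()                       # only fills p.grad for each parameter
print(model.weight.grad is not None)  # True: grads populated, weights not yet changed
optimizer.step()                      # reads p.grad and updates the parameters
```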

So, re: (2), if you are not performing backprop it is fine to skip .backward(), but if you still want to use the existing optimizers, you're now responsible for populating the .grad fields yourself before calling optimizer.step() — roughly like the sketch below.
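Something along these lines, assuming SGD and using a random tensor purely as a stand-in for whatever gradient estimate your forward-only method produces:

```python
import torch

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

with torch.no_grad():
    for p in model.parameters():
        # Replace this with the estimate your forward-only algorithm computes;
        # torch.randn_like(p) is just a placeholder.
        p.grad = torch.randn_like(p)

optimizer.step()       # consumes the .grad fields you set manually
optimizer.zero_grad()  # clear them before computing the next estimate
```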