loss.backward() is actually only responsible for populating the .grad fields of the parameters; the optimizer's step() is what updates the parameters using those .grad fields.
So, re: (2), if you are not performing back-prop it is fine not to call .backward(), but if you still want to use the existing optimizers, you're now responsible for setting the .grad fields yourself before calling step(). A sketch of what that looks like is below.
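
For example, here's a minimal sketch of driving a stock optimizer with hand-set gradients; the parameter shapes, the gradient values, and the learning rate are just placeholders:

```python
import torch

# A toy parameter and a plain SGD optimizer (any torch.optim optimizer works the same way).
param = torch.nn.Parameter(torch.zeros(3))
optimizer = torch.optim.SGD([param], lr=0.1)

# Instead of calling loss.backward(), fill in .grad manually with whatever
# gradient estimate you computed yourself (here just a made-up tensor).
manual_grad = torch.ones_like(param)

optimizer.zero_grad()              # clear any stale gradients
param.grad = manual_grad.clone()   # populate .grad by hand
optimizer.step()                   # the optimizer updates param using param.grad

print(param)  # each entry moved by -lr * grad = -0.1
```

The key point is that step() only ever looks at the .grad attribute of the parameters it was given, so it doesn't care whether autograd or your own code put the values there.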