No_grad() vs requires_grad

ptrblck · July 17, 2018, 11:56am

with torch.no_grad() is a context manager that disables gradient tracking for the code block it wraps: operations inside it are not recorded by autograd, so their results have requires_grad=False.
Usually it is used when you evaluate your model and don’t need to call backward() to calculate the gradients and update the corresponding parameters.
Also, you can use it to initialize the weights with torch.nn.init functions, since you don’t need the gradients there either.
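
To make this concrete, here is a minimal sketch of both uses. The tiny nn.Linear model, the random input, and the variable names are just placeholders I picked for illustration:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)   # placeholder model
x = torch.randn(3, 4)     # placeholder input batch

# Regular forward pass: autograd records the operations,
# so the output is attached to a graph.
out_train = model(x)

with torch.no_grad():
    # Evaluation-style forward pass: nothing is recorded.
    out_eval = model(x)
    # Weight initialization: no gradients needed here either.
    nn.init.xavier_uniform_(model.weight)
    nn.init.constant_(model.bias, 0.0)

tracked = out_train.requires_grad    # output of the tracked pass
untracked = out_eval.requires_grad   # output of the no_grad pass
```

The parameters themselves keep requires_grad=True after the block; only the operations executed inside it go untracked.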

requires_grad, on the other hand, is a tensor attribute (and an argument you can pass when creating a tensor) that marks the tensor as needing gradients. Usually you don’t need to set it yourself in the beginning, since the parameters inside the nn.Modules from the nn package already require gradients by default.
You could set it on your input tensor, for example, if you need gradients with respect to the input in order to update it, as in an adversarial training setup.
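
As a rough sketch of the input case, this is roughly what a single gradient-sign step on the input could look like. The model, the targets, and the epsilon value are made-up placeholders, and this is only one simple way to perturb an input, not a full adversarial training loop:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 2)             # placeholder classifier
criterion = nn.CrossEntropyLoss()

# The input is a leaf tensor that tracks gradients.
x = torch.randn(3, 4, requires_grad=True)
target = torch.tensor([0, 1, 0])    # placeholder labels

loss = criterion(model(x), target)
loss.backward()                     # fills x.grad along with the parameter grads

epsilon = 0.1                       # hypothetical step size
with torch.no_grad():
    # Move the input in the direction that increases the loss.
    x_adv = x + epsilon * x.grad.sign()
```

Note that both pieces show up here: requires_grad=True to get x.grad in the first place, and no_grad() for the update itself, since that step shouldn’t be tracked.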