No_grad() vs requires_grad

I know this is a very basic question, but it’s my first day with PyTorch and I can’t seem to figure it out. What is the difference between no_grad() and requires_grad? When should I use each of them, and when/how can I mix them?

Thanks guys.

Best,
Boris


torch.no_grad() is a context manager and is used to disable gradient calculation in the wrapped code block.
Usually it is used when you evaluate your model and don’t need to call backward() to calculate the gradients and update the corresponding parameters.
You can also use it to initialize the weights with the torch.nn.init functions, since you don’t need gradients there either.
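Here is a minimal sketch of the evaluation use case (the model and input are just made-up placeholders):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)          # hypothetical toy model
x = torch.randn(4, 10)            # hypothetical input batch

model.eval()                      # put layers like dropout/batchnorm into eval mode
with torch.no_grad():             # no computation graph is built inside this block
    out = model(x)

print(out.requires_grad)          # False, so calling out.backward() would fail
```

Since no graph is stored, the forward pass inside the block also uses less memory, which is another reason it is common in validation loops.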

requires_grad, on the other hand, is set when creating a tensor that should track gradients. Usually you don’t need to set it yourself in the beginning, as all parameters that require gradients are already wrapped in the nn.Modules of the nn package.
You could set this property e.g. on your input tensor if you need to update the input, for example in an adversarial training setup, as in the sketch below.
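A small sketch of that idea, assuming a made-up model, target, and step size epsilon (the perturbation is FGSM-style, just to show gradients flowing to the input):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)                      # hypothetical model
criterion = nn.CrossEntropyLoss()

x = torch.randn(1, 10, requires_grad=True)    # ask autograd to track the input
target = torch.tensor([1])                    # hypothetical target class

loss = criterion(model(x), target)
loss.backward()                               # x.grad is now populated

epsilon = 0.01                                # hypothetical step size
x_adv = x + epsilon * x.grad.sign()           # perturb the input using its gradient
```

Without requires_grad=True on x, x.grad would stay None and the last line would fail.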
