Is there any difference between calling the requires_grad_() method and manually setting the requires_grad attribute?

I found that there are two ways to change Tensor.requires_grad.

I can either set x.requires_grad = flag manually or call the method x.requires_grad_(flag). I ran a simple experiment and found no difference.

In [14]: a = torch.tensor([2.0])

In [15]: a.requires_grad
Out[15]: False

In [16]: a.requires_grad_()
Out[16]: tensor([2.], requires_grad=True)

In [17]: a.requires_grad
Out[17]: True

In [18]: a.requires_grad = False

In [19]: a
Out[19]: tensor([2.])

I’m wondering whether there is any subtle difference between the two approaches. Does requires_grad_(flag) do anything other than self.requires_grad = flag?

I tried to read the source code of requires_grad_ but found nothing; it seems to be implemented in C++. :expressionless:


There are some subtle differences between the two.
For example, if you have a non-leaf tensor, setting self.requires_grad = True produces an error, but calling requires_grad_(True) does not.
Both perform some error checking, such as verifying that the tensor is a leaf, before calling into the same set_requires_grad function (implemented in C++), but the exact checks and error messages are not identical:

>>> a = torch.tensor(1., requires_grad=True)
>>> b = a + 1
>>> b.requires_grad_(True)
tensor(2., grad_fn=<AddBackward0>)
>>> b.requires_grad_(False)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: you can only change requires_grad flags of leaf variables. If you want to use a computed variable in a subgraph that doesn't require differentiation use var_no_grad = var.detach().
>>> b.requires_grad = True
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: you can only change requires_grad flags of leaf variables.
>>> b.requires_grad = False
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: you can only change requires_grad flags of leaf variables. If you want to use a computed variable in a subgraph that doesn't require differentiation use var_no_grad = var.detach().
>>> 
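For leaf tensors, on the other hand, both spellings end up toggling the same underlying flag, so they behave interchangeably. A minimal sketch (illustrative only, not from the transcript above):

import torch

a = torch.tensor([2.0])     # leaf tensor, created by the user
a.requires_grad = True      # attribute assignment works on a leaf
print(a.requires_grad)      # True

b = torch.tensor([2.0])
b.requires_grad_(True)      # the in-place method sets the same flag
print(b.requires_grad)      # True

# Both can also turn tracking back off on a leaf.
a.requires_grad = False
b.requires_grad_(False)
print(a.requires_grad, b.requires_grad)   # False False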

The difference arises because the two APIs go through slightly different code paths.
If you’d like to investigate the code yourself, the Python binding for setting the requires_grad attribute directly (self.requires_grad) lives in torch/csrc/autograd/python_variable.cpp.
requires_grad_, on the other hand, is a “native function”, i.e., it has a schema defined in native_functions.yaml. This also means that its Python bindings are code-generated, so you’d have to clone the repo and actually build it to see the code.
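You can get a hint of this split from Python without building anything. The exact types printed may vary across PyTorch versions, so treat this as a rough probe rather than a guarantee:

import torch

# requires_grad is exposed as a C-level property on the tensor type
# (bound by hand in python_variable.cpp), while requires_grad_ is a
# generated native method. On recent versions this typically prints
# something like <class 'getset_descriptor'> and <class 'method_descriptor'>.
print(type(torch.Tensor.requires_grad))
print(type(torch.Tensor.requires_grad_))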


requires_grad is an attribute you read to check whether a tensor tracks gradients,
whereas
requires_grad_ is an in-place method that sets the tensor's requires_grad attribute (to True by default, or to whatever flag you pass) and returns the tensor itself.
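Because requires_grad_ returns the tensor it was called on, it is handy for enabling gradient tracking in a single expression when creating a tensor; a small sketch:

import torch

# Two-step version: create the tensor, then flip the attribute.
w1 = torch.zeros(3)
w1.requires_grad = True

# One-liner using the in-place method, which defaults to True
# and returns the same tensor, so it can be chained.
w2 = torch.zeros(3).requires_grad_()

print(w1.requires_grad, w2.requires_grad)   # True True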