I found that there are two ways to change Tensor.requires_grad.
I can either set x.requires_grad = flag manually or call the method x.requires_grad_(flag). I ran a simple experiment and found no difference between them.
In [14]: a = torch.tensor([2.0])
In [15]: a.requires_grad
Out[15]: False
In [16]: a.requires_grad_()
Out[16]: tensor([2.], requires_grad=True)
In [17]: a.requires_grad
Out[17]: True
In [18]: a.requires_grad = False
In [19]: a
Out[19]: tensor([2.])
I'm wondering whether there is any subtle difference between the two approaches. Does requires_grad_(flag) do anything other than self.requires_grad = flag?
I tried to read the source code of requires_grad_ but found nothing; it seems to be implemented in C++.
There are some subtle differences between the two:
For example, if you have a non-leaf tensor, setting self.requires_grad = True produces an error, but calling requires_grad_(True) does not.
Both perform some error checking, such as verifying that the tensor is a leaf, before calling into the same set_requires_grad function (implemented in C++). But the exact checks performed, and the error messages produced, are not identical:
>>> a = torch.tensor(1., requires_grad=True)
>>> b = a + 1
>>> b.requires_grad_(True)
tensor(2., grad_fn=<AddBackward0>)
>>> b.requires_grad_(False)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: you can only change requires_grad flags of leaf variables. If you want to use a computed variable in a subgraph that doesn't require differentiation use var_no_grad = var.detach().
>>> b.requires_grad = True
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: you can only change requires_grad flags of leaf variables.
>>> b.requires_grad = False
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: you can only change requires_grad flags of leaf variables. If you want to use a computed variable in a subgraph that doesn't require differentiation use var_no_grad = var.detach().
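To summarize the behavior seen above in plain Python, here is a rough illustrative sketch, not the actual C++; the names assign_requires_grad, requires_grad_method, and _set_requires_grad_flag are made up for illustration:

>>> def assign_requires_grad(t, flag):        # models t.requires_grad = flag
...     if not t.is_leaf:                     # any non-leaf tensor is rejected
...         raise RuntimeError("you can only change requires_grad flags of leaf variables. ...")
...     t._set_requires_grad_flag(flag)       # hypothetical name for the shared setter
...
>>> def requires_grad_method(t, flag=True):   # models t.requires_grad_(flag)
...     if not t.is_leaf and not flag:        # only setting False on a non-leaf is rejected, as seen above
...         raise RuntimeError("you can only change requires_grad flags of leaf variables. ...")
...     t._set_requires_grad_flag(flag)
...     return t                              # returns the tensor itself, unlike attribute assignment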
The difference arises because the two APIs go through slightly different code paths.
If you'd like to investigate the code yourself, the Python binding for setting the attribute directly (self.requires_grad = flag) lives in torch/csrc/autograd/python_variable.cpp. requires_grad_, on the other hand, is a "native function", i.e., it has a schema defined in native_functions.yaml. This also means that its Python bindings are codegened, so you'd have to clone the repo and actually build it to see the generated code.
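If you'd rather not build from source, you can at least confirm from an installed copy that the two names are different kinds of objects on the class. This is only a rough check; the exact type and repr strings depend on your PyTorch version, so no output is shown here:

import torch

# self.requires_grad is exposed as a C-level attribute descriptor,
# while requires_grad_ is exposed as a generated method binding.
print(type(torch.Tensor.requires_grad))
print(type(torch.Tensor.requires_grad_))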
requires_grad is an attribute you read to check whether a tensor tracks gradients,
whereas requires_grad_() is an in-place method that sets the tensor's requires_grad attribute (True by default).
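In practice, for leaf tensors the two are interchangeable. The one convenience of requires_grad_ is that it returns the tensor itself, so it can be used inline; a small example:

>>> import torch
>>> w = torch.zeros(3).requires_grad_()   # set the flag and keep the tensor in one expression
>>> w.requires_grad
True
>>> x = torch.zeros(3)
>>> x.requires_grad = True                # plain assignment needs its own statement
>>> x.requires_grad
True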