There are some subtle differences between the two:
For example, if you have a non-leaf tensor, setting requires_grad = True directly on it raises an error, while calling requires_grad_(True) does not.
Both perform some error checking, such as verifying that the tensor is a leaf, before calling into the same set_requires_grad function (implemented in C++), but the exact checks done and the error messages produced are not identical:
>>> a = torch.tensor(1., requires_grad=True)
>>> b = a + 1
>>> b.requires_grad_(True)
tensor(2., grad_fn=<AddBackward0>)
>>> b.requires_grad_(False)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: you can only change requires_grad flags of leaf variables. If you want to use a computed variable in a subgraph that doesn't require differentiation use var_no_grad = var.detach().
>>> b.requires_grad = True
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: you can only change requires_grad flags of leaf variables.
>>> b.requires_grad = False
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: you can only change requires_grad flags of leaf variables. If you want to use a computed variable in a subgraph that doesn't require differentiation use var_no_grad = var.detach().
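As the error message suggests, the supported way to use a computed (non-leaf) tensor in a subgraph that shouldn't be differentiated is not to flip its flag but to detach it. Continuing the session above (b_no_grad is just an illustrative name):

>>> b_no_grad = b.detach()
>>> b_no_grad.requires_grad
False
>>> b.requires_grad
True

b_no_grad shares its data with b but is cut out of the autograd graph, so gradients won't flow back through it.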
The difference arises because the two APIs go through slightly different code paths. If you'd like to investigate the code yourself, the Python binding for setting the requires_grad attribute directly on the tensor (tensor.requires_grad = ...) lives in torch/csrc/autograd/python_variable.cpp.
requires_grad_, on the other hand, is a "native function", i.e., it has a schema defined in native_functions.yaml. This also means that its Python bindings are code-generated, so you'd have to clone the repo and actually build PyTorch to see the generated code.
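That said, you can already see from Python that the two entry points are different kinds of attributes without building from source. A quick sketch; the descriptor types printed here are what I'd expect on a recent build and may vary across versions:

import torch

# requires_grad is a property-style descriptor defined in C++
# (python_variable.cpp), while requires_grad_ is a method whose binding
# is generated from the native_functions.yaml schema.
print(type(torch.Tensor.requires_grad))   # typically <class 'getset_descriptor'>
print(type(torch.Tensor.requires_grad_))  # typically <class 'method_descriptor'>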