As far as I understand, autograd does not allow backpropagation through tensors whose `_version` attribute is greater than 0. However, when training a linear model, the `_version` of the learnable parameters does change each time the optimizer's `step()` function is called, and this normally does not produce any errors. Are exceptions made for `nn.Parameter` instances? Or for leaves in the computational graph? Or is the mechanism something else entirely?
Take the following example:
```python
import torch
from torch import nn


class MyModule(nn.Module):
    def __init__(self):
        super().__init__()
        self.my_param = nn.Parameter(torch.tensor([1.0], requires_grad=True))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.my_param


mm = MyModule()
optim = torch.optim.Adam(mm.parameters())

for epoch in range(2):
    optim.zero_grad()
    loss = mm(torch.tensor([2.0]))
    loss.backward()
    optim.step()  # updates my_param in-place

print(mm.my_param._version)
print(mm(torch.tensor([2.0]))._version)
print(loss._version)
```
This gives the following output:
```
2
0
0
```
So the `_version` attribute of `my_param` does not stay 0.
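A minimal sketch that may help narrow down the mechanism, assuming the check compares the version a tensor had when it was saved for backward with its version when `backward()` runs, rather than requiring the version to be 0. The names `w`, `x` and the result variables are hypothetical and not part of the example above. `x` is given `requires_grad=True` here so that `w` is actually saved for the backward pass of the multiplication.

```python
import torch

w = torch.tensor([1.0], requires_grad=True)
x = torch.tensor([2.0], requires_grad=True)

# Case 1: in-place update AFTER backward, as an optimizer step does.
loss = (x * w).sum()
loss.backward()
with torch.no_grad():
    w.add_(1.0)  # bumps w._version
version_after_update = w._version

# A fresh forward pass builds a new graph and records the current version,
# so backward still works even though w._version is no longer 0.
loss2 = (x * w).sum()
loss2.backward()

# Case 2: in-place update BETWEEN forward and backward.
loss3 = (x * w).sum()
with torch.no_grad():
    w.add_(1.0)  # w was saved for backward by the multiplication above
try:
    loss3.backward()
    stale_graph_error = None
except RuntimeError as err:
    stale_graph_error = str(err)

print(version_after_update)
print(stale_graph_error)
```

If this reading is right, what matters is not whether a tensor is an `nn.Parameter` or a leaf, but whether a tensor that some graph still needs was modified in-place after that graph was built and before `backward()` ran on it.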
The context in which I run into problems is as follows: I am more or less simulating particles using Newtonian mechanics. I want to train the initial positions, initial velocities, initial accelerations and the masses of the particles such that they end up at a specific position. Currently it works for one epoch, but in the second epoch I get errors that some `_version` is not 0, and `torch.autograd.set_detect_anomaly(True)` unfortunately did not produce anything helpful. It is unclear to me which variables are allowed to change in-place and which are not. If only `nn.Parameter` instances are allowed to change in-place, then I can start hunting for a variable that is neither a parameter nor reset at the start of the second epoch. But currently I do not know when in-place operations are allowed, so I find it hard to tell what may be going wrong.
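To make the setup concrete, here is a minimal sketch of this kind of training loop for a single 1-D particle. Everything in it (`pos0`, `vel0`, `target`, `dt`, `n_steps`, the learning rate) is a hypothetical stand-in, not the actual simulation code. It assumes the simulation state is rebuilt from the parameters at the start of every epoch and advanced with out-of-place updates, which is one pattern that avoids stale tensors carried over from a previous epoch; whether that is the cause of the error described above cannot be told from the description alone.

```python
import torch

# Hypothetical learnable initial conditions for one 1-D particle.
pos0 = torch.nn.Parameter(torch.tensor([0.0]))
vel0 = torch.nn.Parameter(torch.tensor([1.0]))
target = torch.tensor([3.0])
dt, n_steps = 0.1, 10

optim = torch.optim.Adam([pos0, vel0], lr=0.1)

losses = []
for epoch in range(3):
    optim.zero_grad()

    # Rebuild the simulation state from the parameters every epoch,
    # so nothing from the previous epoch's graph is reused.
    pos, vel = pos0, vel0

    for _ in range(n_steps):
        # Out-of-place update: creates a new tensor each step
        # instead of modifying `pos` in-place with `pos += vel * dt`.
        pos = pos + vel * dt

    loss = ((pos - target) ** 2).sum()
    loss.backward()
    optim.step()  # in-place parameter update, after backward has finished
    losses.append(loss.item())

print(losses)
```

In this sketch the only in-place modifications are the optimizer's updates to `pos0` and `vel0`, and they happen after `backward()` has already consumed the graph for that epoch.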