One of the variables needed for gradient computation has been modified

I am trying to use eager execution with custom nodes that hold parameters.
Since training runs over multiple iterations, I call backward with the following option:

loss.backward(retain_graph = True)

Then I get the following error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-18-80ae8e772693> in <module>()
     19       y = model()
     20       loss = criterion(y, t)
---> 21       loss.backward(retain_graph = True)
     22       optimizer.step()

1 frames
/usr/local/lib/python3.6/dist-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
    105                 products. Defaults to ``False``.
    106         """
--> 107         torch.autograd.backward(self, gradient, retain_graph, create_graph)
    108 
    109     def register_hook(self, hook):

/usr/local/lib/python3.6/dist-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
     91     Variable._execution_engine.run_backward(
     92         tensors, grad_tensors, retain_graph, create_graph,
---> 93         allow_unreachable=True)  # allow_unreachable flag
     94 
     95 

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [28, 128]] is at version 25088; expected version 21504 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
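
The hint at the end of the message refers to PyTorch's anomaly detection. A minimal sketch of turning it on (it only locates the failing in-place operation, it does not fix it; the loop below is assumed otherwise unchanged):

torch.autograd.set_detect_anomaly(True)  # slows training; enable only while debugging

y = model()
loss = criterion(y, t)
loss.backward(retain_graph = True)  # the error now reports a traceback of the op that modified the tensor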

Does this mean that the back-propagation path is broken (i.e. it cannot be computed)?

The training code is as follows:

model.train()
  
for epoch in range(EPOCH):
  for x, t in dataloader_train:
    t = t.to(device)
    for time in range(TIME_STEPS):

      x_ = x[0][0][time]
      x_ = x_.to(device)
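      # FIFO shift: move each fw_x / fw_h entry one slot up, then write the new input into slot 0 (all in-place)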
      for index_a in range(NUM_INPUT):
        for index_b in range(NUM_HIDDEN-1, 0, -1):
          model.fw_x[index_a][index_b] = model.fw_x[index_a][index_b - 1]
          model.fw_h[index_a][index_b] = model.fw_h[index_a][index_b - 1]
      
        model.fw_x[index_a][0:NUM_INPUT] = x_
        model.fw_h[index_a][0] = 0.0

      model.zero_grad()
      y = model()
      loss = criterion(y, t)
      loss.backward(retain_graph = True)
      optimizer.step()

Is this kind of FIFO coding not allowed? If so, how can I write the same function in
PyTorch grammar?

Hi,

Writing into Tensors in-place like this can be problematic.
If you can use lists instead, that will solve the problem.
Otherwise, you need to avoid the problematic in-place operations, either by creating new Tensors every time or by adding a clone() of fw_x/fw_h after all your in-place ops.
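
For example, a minimal sketch of the first two options, using the names from the training code above and assuming fw_x is a [NUM_INPUT, NUM_HIDDEN] Tensor and x_ a 1-D Tensor of length NUM_INPUT (adapt the slicing to your model):

# option 1: keep a plain Python list and stack it into a fresh Tensor for the forward pass
# (fw_x_list is initialised once before the loop, e.g. NUM_HIDDEN zero Tensors of length NUM_INPUT)
fw_x_list.pop()                              # drop the oldest entry
fw_x_list.insert(0, x_)                      # push the newest input to the front
model.fw_x = torch.stack(fw_x_list, dim=1)   # fresh [NUM_INPUT, NUM_HIDDEN] Tensor every step

# option 2: build a new Tensor directly instead of shifting in-place
model.fw_x = torch.cat((x_.unsqueeze(1), model.fw_x[:, :-1]), dim=1)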


Hi,

I tried this style:

model.fw_x = torch.stack((model.fw_x[1:], x_))

where “x_” and “fw_x” are one- and two-dimensional, respectively.
torch.cat() needs the two inputs to have the same number of dimensions, so instead I used torch.stack() with “[1:]” to get the FIFO behavior, but I get this error:

RuntimeError: invalid argument 0: Tensors must have same number of dimensions: got 3 and 2 at /pytorch/aten/src/TH/generic/THTensor.cpp:702

I still do not understand it. Any suggestions?
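
For reference, torch.stack() inserts a new dimension in front of each input before concatenating, so the 2-D fw_x[1:] becomes 3-D while the 1-D x_ becomes 2-D, which is where "got 3 and 2" comes from. A sketch of the usual cat-based FIFO update instead, assuming x_ has the same length as one row of fw_x:

model.fw_x = torch.cat((model.fw_x[1:], x_.unsqueeze(0)), dim=0)  # drop the oldest row, append x_ as the newest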