I get the following error:
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [64, 1, 6]], which is output 0 of TanhBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
My forward function is:
def forward(self, x: torch.Tensor):
    # x: [Batch_size, seq_len, input_size]
    batch_size = x.size(0)
    # Flatten the input across the sequence and input size
    x = x.reshape(batch_size, -1)  # x: [Batch_size, seq_len * input_size]
    # Apply the linear layer
    x = self.Linear(x)  # x: [Batch_size, pred_len * predict_size]
    # Reshape the output to [Batch_size, pred_len, predict_size]
    x = x.reshape(batch_size, self.pred_len, self.predict_size)
    # Apply activation function if specified
    x = self.activation(x)
    return x  # [Batch_size, pred_len, predict_size]
Using view() instead of reshape() doesn’t help.
self.activation can be any of relu, tanh, … If I remove x = self.activation(x), the code works fine.
Could you please advise what’s wrong here?
Most likely, the output of forward() is being modified inplace after forward() has run.
self.activation(x) isn’t causing the inplace-modification error, per se.
Rather, the presence of self.activation(x) in the computation graph causes an inplace modification, one that exists with or without the call to self.activation(x), to matter: tanh (like a number of other ops) saves its output for the backward pass, so mutating that output afterwards invalidates the saved tensor.
There is a reasonable chance that adding a .clone(), specifically x = self.activation(x).clone(), will fix your problem.
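Here is a minimal, standalone reproduction of the same failure mode, together with the .clone() workaround (a toy sketch of my own, not your model):

import torch

# 1) Reproduce the error: tanh saves its own output for the backward pass,
#    so an in-place edit bumps that tensor's version counter from 0 to 1.
x = torch.randn(4, requires_grad=True)
y = torch.tanh(x)
y.add_(1.0)                       # in-place modification of tanh's output
try:
    y.sum().backward()
except RuntimeError as err:
    print(err)                    # "... output 0 of TanhBackward0, is at version 1 ..."

# 2) Workaround: hand downstream code a clone, so in-place edits touch the
#    copy rather than the tensor saved by TanhBackward0.
x = torch.randn(4, requires_grad=True)
y = torch.tanh(x).clone()
y.add_(1.0)                       # only the clone is modified
y.sum().backward()                # backward now succeeds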
Sometimes these errors are a symptom of something incorrect or sub-optimal in what you are doing. If you want to track down the root cause, take a look at the debugging techniques in the following post:
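Independently of that post, the hint at the end of your error message is also worth acting on: re-run the failing step with anomaly detection enabled, and the backward error will be accompanied by a traceback of the forward call (here the tanh) whose saved output was later modified. A rough, self-contained sketch (again a toy example, not your training loop):

import torch

# Anomaly detection makes the backward error print the traceback of the
# forward call whose saved tensor was modified in place.
x = torch.randn(8, 4, requires_grad=True)
with torch.autograd.detect_anomaly():
    y = torch.tanh(x)
    y[:, 0] = 0.0                 # the offending in-place write
    try:
        y.sum().backward()
    except RuntimeError as err:
        print(err)                # plus a "Traceback of forward call that caused the error" warning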
Your suggestion solves the problem. Perhaps a better way, which also does the trick, is to swap two lines:
# Apply activation function if specified
x = self.activation(x)
# Reshape the output to [Batch_size, pred_len, predict_size]
x = x.reshape(batch_size, self.pred_len, self.predict_size)
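For completeness, the full forward() with that swap applied would look like this (a sketch keeping the attribute names from the original module):

def forward(self, x: torch.Tensor):
    # x: [Batch_size, seq_len, input_size]
    batch_size = x.size(0)
    # Flatten the input across the sequence and input size
    x = x.reshape(batch_size, -1)  # x: [Batch_size, seq_len * input_size]
    # Apply the linear layer
    x = self.Linear(x)  # x: [Batch_size, pred_len * predict_size]
    # Apply activation function if specified (now before the final reshape)
    x = self.activation(x)
    # Reshape the output to [Batch_size, pred_len, predict_size]
    x = x.reshape(batch_size, self.pred_len, self.predict_size)
    return x  # [Batch_size, pred_len, predict_size]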