Gradient computation sees a variable modified, but it doesn't seem obvious where

I get the following error:
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [64, 1, 6]], which is output 0 of TanhBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

My forward function is:

def forward(self, x: torch.Tensor):
        # x: [Batch_size, seq_len, input_size]
        batch_size = x.size(0)
        
        # Flatten the input across the sequence and input size
        x = x.reshape(batch_size, -1) # x: [Batch_size, seq_len * input_size]
        
        # Apply the linear layer
        x = self.Linear(x) # x: [Batch_size, pred_len * predict_size]
        
        # Reshape the output to [Batch_size, pred_len, predict_size]
        x = x.reshape(batch_size, self.pred_len, self.predict_size)

        # Apply activation function if specified
        x = self.activation(x)
        
        return x # [Batch_size, pred_len, predict_size]

self.Linear is defined as

self.Linear = nn.Linear(self.seq_len * self.input_size, self.pred_len * self.predict_size)

Using view() instead of reshape() doesn’t help.
self.activation can be any activation function: relu, tanh, … If I remove x = self.activation(x), the code works fine.
Could you please advise what’s wrong here?

Hi iftg!

Most likely, the output of forward() is being modified inplace after
forward() has run.

self.activation(x) isn’t causing the inplace-modification error, per se.
Rather, the presence of self.activation(x) in the computation graph is
what causes an inplace modification (one that exists with or without the
call to self.activation(x)) to matter.
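
The underlying reason is that tanh (and relu) save their own output for
the backward pass, while the backward of a plain Linear layer only needs
its input and weights. Here is a minimal sketch (a toy reproduction, not
your actual code) that triggers the same error by modifying the returned
tensor inplace after the forward pass:

import torch

lin = torch.nn.Linear(3, 3)
x = torch.randn(4, 3)

# Without an activation: AddmmBackward0 does not need the layer's output,
# so an inplace edit of the returned tensor is harmless.
out = lin(x)
out.mul_(2.0)
out.sum().backward()       # runs fine

# With tanh: TanhBackward0 saves its output to compute the gradient, so
# the same inplace edit now raises the "modified by an inplace operation"
# RuntimeError.
out = torch.tanh(lin(x))
out.mul_(2.0)
out.sum().backward()       # raises the same RuntimeError as in your post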

There is a reasonable chance that adding a .clone(), specifically
x = self.activation(x).clone(), will fix your problem.
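
The clone works because any later inplace edit then lands on the clone,
while the tensor that tanh saved for backward stays at its original
version. Continuing the toy sketch above:

out = torch.tanh(lin(x)).clone()   # tanh's saved output is left untouched
out.mul_(2.0)                      # the inplace edit only modifies the clone
out.sum().backward()               # backward now succeeds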

Or ask pytorch to sweep this inplace modification error under the rug
for you.
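
One mechanism for that (assuming a reasonably recent PyTorch, roughly
1.13 or later, and that this is indeed the intended "rug") is the
allow_mutation_on_saved_tensors() context manager, under which tensors
saved for backward are cloned on demand when they are about to be
mutated:

import torch

lin = torch.nn.Linear(3, 3)
x = torch.randn(4, 3)

# Inside this context, saved tensors are cloned on mutation, so the
# inplace edit no longer invalidates what TanhBackward0 needs.
with torch.autograd.graph.allow_mutation_on_saved_tensors():
    out = torch.tanh(lin(x))
    out.mul_(2.0)
    out.sum().backward()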

Sometimes these errors are a symptom of something incorrect or
sub-optimal in what you are doing. If you want to track down the root
cause, take a look at the debugging techniques in the following post:
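
A good first step, independent of that, is the hint printed with the
error itself: enabling anomaly detection makes the backward failure also
report the traceback of the forward call (here, the tanh) whose saved
output was modified. A sketch, reusing the toy example from above:

import torch

# Global switch; torch.autograd.detect_anomaly() also exists as a
# context manager if you only want it around a single training step.
torch.autograd.set_detect_anomaly(True)

lin = torch.nn.Linear(3, 3)
out = torch.tanh(lin(torch.randn(4, 3)))
out.mul_(2.0)              # the offending inplace edit
out.sum().backward()       # the RuntimeError now also includes the
                           # forward-pass traceback of the tanh call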

Best.

K. Frank

Your suggestion solves the problem. Perhaps a better way, which also does the trick, is to swap these two lines:

        # Apply activation function if specified
        x = self.activation(x)

        # Reshape the output to [Batch_size, pred_len, predict_size]
        x = x.reshape(batch_size, self.pred_len, self.predict_size)