# Problem encountered during backprop: one of the variables needed for gradient computation has been modified by an inplace operation

Hi, I’m trying to finish Assignment 4 of EECS 498-007, but a problem in my code causes backprop to fail.

Here is my code:

```python
import torch


def rnn_step_forward(x, prev_h, Wx, Wh, b):
    """
    Run the forward pass for a single timestep of a vanilla RNN that uses a
    tanh activation function.

    The input data has dimension D, the hidden state has dimension H, and we
    use a minibatch size of N.

    Inputs:
    - x: Input data for this timestep, of shape (N, D)
    - prev_h: Hidden state from previous timestep, of shape (N, H)
    - Wx: Weight matrix for input-to-hidden connections, of shape (D, H)
    - Wh: Weight matrix for hidden-to-hidden connections, of shape (H, H)
    - b: Biases, of shape (H,)

    Returns a tuple of:
    - next_h: Next hidden state, of shape (N, H)
    - cache: Tuple of values needed for the backward pass.
    """
    next_h = torch.tanh(x @ Wx + prev_h @ Wh + b)
    cache = (x, prev_h, Wx, Wh, b, next_h)
    return next_h, cache


def rnn_forward(x, h0, Wx, Wh, b):
    """
    Run a vanilla RNN forward on an entire sequence of data. We assume an
    input sequence composed of T vectors, each of dimension D. The RNN uses a
    hidden size of H, and we work over a minibatch containing N sequences.
    After running the RNN forward, we return the hidden states for all
    timesteps.

    Inputs:
    - x: Input data for the entire timeseries, of shape (N, T, D)
    - h0: Initial hidden state, of shape (N, H)
    - Wx: Weight matrix for input-to-hidden connections, of shape (D, H)
    - Wh: Weight matrix for hidden-to-hidden connections, of shape (H, H)
    - b: Biases, of shape (H,)

    Returns a tuple of:
    - h: Hidden states for the entire timeseries, of shape (N, T, H)
    - cache: Values needed in the backward pass
    """
    N, T, D = x.size()
    _, H = h0.size()
    h = torch.zeros(N, T, H, dtype=torch.double, device='cuda')
    cache = []
    h[:, 0, :], c = rnn_step_forward(x[:, 0, :], h0, Wx, Wh, b)
    cache.append(c)
    for i in range(T - 1):
        h[:, i + 1, :], c = rnn_step_forward(x[:, i + 1, :], h[:, i, :], Wx, Wh, b)
        cache.append(c)
    return h, cache


N, D, T, H = 2, 3, 10, 5

# to_double_cuda and b come from earlier cells of the assignment notebook;
# I assume they look like this:
to_double_cuda = {'dtype': torch.float64, 'device': 'cuda'}
b = torch.randn(H, **to_double_cuda, requires_grad=True)

x = torch.randn(N, T, D, **to_double_cuda, requires_grad=True)
h0 = torch.randn(N, H, **to_double_cuda, requires_grad=True)
Wx = torch.randn(D, H, **to_double_cuda, requires_grad=True)
Wh = torch.randn(H, H, **to_double_cuda, requires_grad=True)

out, cache = rnn_forward(x, h0, Wx, Wh, b)

dout = torch.randn(*out.shape, **to_double_cuda)
out.backward(dout)  # the magic happens here!
```

The error looks like this (middle of the traceback elided):

```
<ipython-input-34-2ebf82a736a2> in <module>
---> 22 out.backward(dout)  # the magic happens here!

E:\developer\Anaconda\envs\tensorflow\lib\site-packages\torch\tensor.py in backward(self, gradient, retain_graph, create_graph)
    ...
    Variable._execution_engine.run_backward(
-->     allow_unreachable=True)  # allow_unreachable flag

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.DoubleTensor [2, 5]], which is output 0 of SliceBackward, is at version 10; expected version 9 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
```
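
If I’m reading the message right, the “version” numbers refer to autograd’s per-tensor version counter: every in-place operation increments it, and a view shares the counter with its base tensor. A tiny illustration (using the private `_version` attribute, which I assume exposes this counter):

```python
import torch

a = torch.zeros(4)
print(a._version)  # 0
a[0] = 1.0         # an in-place write increments the counter
print(a._version)  # 1
v = a[:2]          # a view shares its base's counter
a[3] = 2.0         # writing anywhere in `a` bumps it for `v` too
print(v._version)  # 2
```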

What’s going wrong and how should I debug this?
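
My plan for narrowing it down is to re-run everything under anomaly detection (a sketch, assuming the tensors defined above are still in scope):

```python
import torch

# Re-running forward + backward under anomaly detection makes the
# RuntimeError carry a second traceback that points at the forward op
# whose saved tensor was later modified in place.
with torch.autograd.detect_anomaly():
    out, cache = rnn_forward(x, h0, Wx, Wh, b)
    dout = torch.randn(*out.shape, **to_double_cuda)
    out.backward(dout)
```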

I modified my code to

```python
    N, T, D = x.size()
    _, H = h0.size()
    h = torch.zeros(N, T, H, dtype=torch.double, device='cuda')
    cache = []
    tmp, c = rnn_step_forward(x[:, 0, :], h0, Wx, Wh, b)
    cache.append(c)
    h[:, 0, :] = tmp
    for i in range(T - 1):
        tmp, c = rnn_step_forward(x[:, i + 1, :], tmp, Wx, Wh, b)
        h[:, i + 1, :] = tmp
        cache.append(c)
    return h, cache
```

and it works, but I still don’t understand what the difference is or what was actually going wrong :(
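
In case it helps the discussion, here is my attempt to distill the failure into a minimal, standalone example (this is my understanding of the mechanism, not something I’ve verified against autograd internals): a slice of `h` shares `h`’s version counter, matmul saves its inputs for the backward pass, and a later in-place write to any part of `h` bumps that shared counter:

```python
import torch

h = torch.zeros(2, 3, 5, dtype=torch.double)
w = torch.randn(5, 5, dtype=torch.double, requires_grad=True)

prev = h[:, 0, :]     # a view; shares h's version counter
out = prev @ w        # matmul saves `prev` to compute the gradient of w
h[:, 1, :] = out      # in-place write to h bumps the shared counter
out.sum().backward()  # RuntimeError: ... modified by an inplace operation
```

If that’s right, it would also explain why the modified version runs: each step reads `tmp`, a fresh tensor returned by `rnn_step_forward`, instead of a slice of `h`, so none of the tensors saved for backward share storage with the buffer I keep writing into.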