RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [64, 50]], which is output 0 of AsStridedBackward0, is at version 2; expected version 1 instead

Jun_Park · July 24, 2023, 5:38am

I’m trying to stack GRUCell modules but got the error above. I didn’t use GRU because the input at each step comes from the output (actually, a modification of it) of the previous step.

import torch
import torch.nn as nn
import torch.optim as optim

class Stacked_GRU_Cells(nn.Module):
    def __init__(self, input_size, hidden_size):
        super(Stacked_GRU_Cells, self).__init__()

        self.hidden_size = hidden_size

        self.gru_0 = nn.GRUCell(input_size, hidden_size)
        self.gru_1 = nn.GRUCell(hidden_size, hidden_size)
        self.out = nn.Linear(hidden_size, input_size)

    def forward(self, x, h_in):
        if h_in is None:
            h_in = torch.zeros(2, x.shape[0], self.hidden_size, device=x.device) # (2, batch_size, hidden_size)
        
        h_out = torch.zeros(2, x.shape[0], self.hidden_size, device=x.device)

        h_out[0] = self.gru_0(x, h_in[0])
        h_out[1] = self.gru_1(h_out[0], h_in[1])

        x = self.out(h_out[1])
        return x, h_out
    
def forward_RNN_pass(gru_rnn, input_data, hidden_size):
    batch_size = input_data.size(0)
    seq_len = 2

    # Initialize hidden state
    h = torch.zeros(2, batch_size, hidden_size, device=input_data.device)
    x = input_data

    # Loop over the steps, feeding each output back in as the next input
    for _ in range(seq_len):
        # Forward pass through GRU layer
        x, h = gru_rnn(x, h)
    return x, h

input_dim = 501
hidden_size = 50
gru_rnn = Stacked_GRU_Cells(input_dim, hidden_size)

# define optimizer and loss
optimizer = optim.Adam(gru_rnn.parameters(), lr=1e-3)
criterion = nn.MSELoss()

for _ in range(2):
    running_loss = 0.0
    for i, samples in enumerate(train_dataloader):
        x, y = samples
        print('x', x.shape)

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        x_hat, _ = forward_RNN_pass(gru_rnn, x, hidden_size)
        loss = criterion(x_hat, x)
        loss.backward()
        optimizer.step()

KFrank · July 24, 2023, 4:51pm

Hi Jun!

Assigning into a tensor using indexing is an inplace operation, and this
is likely the cause of your error.
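
To make that concrete, here is a minimal sketch (not your model, just a toy with made-up tensors `w`, `h`, and `y`) of how indexed assignment trips autograd. It assumes that peeking at the tensor's internal `_version` counter is acceptable for illustration; that attribute is an implementation detail rather than something to rely on in real code.

```python
import torch

w = torch.ones(3, requires_grad=True)
h = torch.zeros(2, 3)

version_before = h._version
h[0] = w * 2                 # indexed assignment writes into h in place
version_after = h._version   # the version counter of h has been bumped

y = h[0] ** 2                # pow saves its input (a view of h) for backward
h[1] = w * 3                 # a second in-place write to h also invalidates that saved view

try:
    y.sum().backward()
    raised = False
except RuntimeError as err:
    raised = True            # autograd notices the saved tensor changed version
    print(err)
```

The view `h[0]` shares its version counter with `h`, so writing to `h[1]` afterwards is enough to make the saved input look stale when backward runs.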

Try replacing the two lines that assign into h_out[0] and h_out[1] with

h_0 = self.gru_0(x, h_in[0])
h_out = torch.stack((h_0, self.gru_1(h_0, h_in[1])))
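
For reference, a self-contained sketch of how the whole module might look with that change applied. The class name `StackedGRUCellsFixed`, the small sizes, and the random input are assumptions chosen only so the snippet runs on its own; the layers mirror the ones in your post.

```python
import torch
import torch.nn as nn

class StackedGRUCellsFixed(nn.Module):  # hypothetical name for the out-of-place variant
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.gru_0 = nn.GRUCell(input_size, hidden_size)
        self.gru_1 = nn.GRUCell(hidden_size, hidden_size)
        self.out = nn.Linear(hidden_size, input_size)

    def forward(self, x, h_in=None):
        if h_in is None:
            h_in = torch.zeros(2, x.shape[0], self.hidden_size, device=x.device)
        h_0 = self.gru_0(x, h_in[0])
        h_1 = self.gru_1(h_0, h_in[1])
        # Build the new hidden state out of place instead of writing into a buffer
        h_out = torch.stack((h_0, h_1))
        return self.out(h_1), h_out

torch.manual_seed(0)
model = StackedGRUCellsFixed(input_size=8, hidden_size=4)  # toy sizes, not the 501/50 from the post
x = torch.randn(5, 8)

x_hat, h = x, None
for _ in range(2):           # feed each output back in as the next input
    x_hat, h = model(x_hat, h)

loss = nn.functional.mse_loss(x_hat, x)
loss.backward()              # completes without the in-place error
```

Because `torch.stack` allocates a fresh tensor, nothing that autograd saved during the forward pass is overwritten later.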

If that doesn’t fix your problem, take a look at this post that gives some
suggestions for debugging inplace-modification errors:

"RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [64, 1]], which is output 0 of AsStridedBackward0, is at version 3; expected version 2 instead. Hint: the backtrace further a autograd

Hi Fahmyadan and Sangyoon! Here are some suggestions about how to track down (and maybe fix) inplace-modification errors. Note that an inplace modification in the forward pass is not necessarily an error – it depends on whether and how the tensor that was modified is used in the backward pass. Note that inplace operations can be useful for saving memory – if you replace an innocent inplace operation with an out-of-place equivalent, your training will use more memory (and, to a minor e…

Best.

K. Frank

Jun_Park · July 25, 2023, 7:11am

Hi K. Frank,

It solved my problem!
I didn’t know assigning values using tensor indices was an in-place modification.
Thanks a lot.

Jun