Inplace operation error with sliding window data loader

Hi.

I’m training a self-supervised RNN on time-series data using a sliding window, but an in-place operation error occurs while training the model…

The following is my code:

import torch
import numpy as np
from torch.utils.data import DataLoader
from torch import nn

data = torch.randn(100, 9)

class TimeseriesDataset(torch.utils.data.Dataset):
    """Sliding-window dataset: each item is a window of seq_len consecutive steps."""

    def __init__(self, X, y, seq_len=1):
        self.X = X
        self.y = y
        self.seq_len = seq_len

    def __len__(self):
        return len(self.X) - (self.seq_len - 1)

    def __getitem__(self, index):
        return (self.X[index:index + self.seq_len],
                self.y[index:index + self.seq_len])

data = torch.tensor(np.array(data), dtype=torch.float32)

# Predict the next time step: the targets are the inputs shifted by one.
train_dataset = TimeseriesDataset(data[:-1], data[1:], seq_len=5)
train_loader = DataLoader(train_dataset, batch_size=1, shuffle=False)

input_size = 9
hidden_size = 9
num_layers = 3

rnn = nn.RNN(input_size=input_size, hidden_size=hidden_size,
             num_layers=num_layers, batch_first=True)

num_epochs = 5
learning_rate = 0.01

optimizer = torch.optim.Adam(rnn.parameters(), lr=learning_rate)

criterion = nn.CrossEntropyLoss()

hidden = None

for epoch in range(num_epochs):
    for i, d in enumerate(train_loader):
        # The hidden state is carried over from the previous window.
        out, hidden = rnn(d[0], hidden)

        loss = criterion(d[1], out)

        optimizer.zero_grad()
        loss.backward(retain_graph=True)
        optimizer.step()

        print(loss.item())

This results in the following error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [9, 9]], which is output 0 of AsStridedBackward0, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
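The hint in the error message points at torch.autograd.set_detect_anomaly(True); enabling it before the training loop makes autograd report which forward-pass operation's saved tensors were later modified in place (standard PyTorch API, shown here only for illustration):

import torch

# Slower, but the traceback then names the op whose saved tensors
# were modified in place before backward() ran.
torch.autograd.set_detect_anomaly(True)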

Hi,
I’m not entirely sure why the error occurs, but do you have a reason to use retain_graph=True in the backward call?

It sometimes causes exactly the error you are facing.

As far as I know, if loss.backward() is executed multiple times in one epoch, the retain_graph option has to be True…

If I run that code without the retain_graph option, I get the following error… T T

Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.

I found the cause of this problem…!
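For anyone hitting the same error with this pattern: the usual culprit is that the hidden state returned by the RNN still references the autograd graph of the previous window, so the next backward() walks back through weights that optimizer.step() has already modified in place (the [9, 9] tensor in the message matches one of the RNN's weight matrices here). A minimal sketch of the common fix, detaching the hidden state each iteration so retain_graph=True is no longer needed — nn.MSELoss is swapped in as an assumption, since the targets are continuous next-step values, and train_loader is the loader defined above:

import torch
from torch import nn

rnn = nn.RNN(input_size=9, hidden_size=9, num_layers=3, batch_first=True)
optimizer = torch.optim.Adam(rnn.parameters(), lr=0.01)
criterion = nn.MSELoss()  # assumption: regression loss for continuous targets

hidden = None
for epoch in range(5):
    for x, y in train_loader:
        # Cut the graph link to the previous window: backward() then only
        # covers the current step, and the old graph (whose weights were
        # updated in place by optimizer.step()) is never revisited.
        if hidden is not None:
            hidden = hidden.detach()

        out, hidden = rnn(x, hidden)
        loss = criterion(out, y)

        optimizer.zero_grad()
        loss.backward()  # retain_graph=True no longer needed
        optimizer.step()

        print(loss.item())

If the hidden state doesn't need to persist across windows at all, passing hidden=None on every iteration avoids the problem in the same way.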