CRNN: one of the variables needed for gradient computation has been modified by an inplace operation

I am trying to build a CRNN that feeds the output of a CNN into an LSTM layer. However, I am running into an in-place operation error somewhere and I can't tell where it comes from. Here is the error message, followed by my model: "one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [256, 64]] is at version 2; expected version 1 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient."

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class CNN_RNN(nn.Module):

    def __init__(self):
        super(CNN_RNN, self).__init__()
        self.conv1 = nn.Conv2d(3, 16, 3)
        self.conv2 = nn.Conv2d(16, 32, 3)
        self.conv3 = nn.Conv2d(32, 64, 3)
        self.conv4 = nn.Conv2d(64, 128, 3)
        self.conv5 = nn.Conv2d(128, 64, 3)
        self.fc = nn.Linear(128, 16)  # (in_features, out_features); 128 = 2 * 64 from the bidirectional LSTM
        self.pool1 = nn.MaxPool2d(2, 2)
        self.pool2 = nn.MaxPool2d(4, 4)
        self.lstm = nn.LSTM(input_size=64, hidden_size=64, bidirectional=True, batch_first=True)


    def forward(self, x, hidden):
        cnn = self.pool1(F.relu(self.conv1(x), inplace=False))
        cnn = self.pool1(F.relu(self.conv2(cnn), inplace=False))
        cnn = self.pool1(F.relu(self.conv3(cnn), inplace=False))
        cnn = self.pool2(F.relu(self.conv4(cnn), inplace=False))
        cnn = self.pool2(F.relu(self.conv5(cnn), inplace=False))  # outputs (batch, filters, width, height)

        cnn = torch.squeeze(cnn)
        cnn = torch.unsqueeze(cnn, 1)
        lstm, hidden = self.lstm(cnn, hidden)
        lstm = torch.squeeze(lstm)
        out = self.fc(lstm)
        out = F.softmax(out).clone()
        return out, hidden

model = CNN_RNN()
criterion = nn.NLLLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

for epoch in range(max_epochs):
    for i, data in enumerate(dataloaders["train"], 0):
        inputs, labels = data
        optimizer.zero_grad()
        outputs, hidden = model(inputs, hidden)
        loss = criterion(outputs, labels)
        loss.backward(retain_graph=True)
        optimizer.step()

I think the backward call is raising this error: you call loss.backward(retain_graph=True) and then optimizer.step(), so the next iteration backpropagates through the retained graph after the optimizer has already modified the parameters in place, which could be a similar issue to this one.
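If you only need the hidden state's values to carry over between batches (not gradients through previous batches), the usual fix is to detach it at the start of each iteration and drop retain_graph=True, so each backward pass only sees the graph built in the current iteration. Below is a minimal sketch of the loop, reusing the names from your code and assuming that truncating the gradient at batch boundaries is acceptable:

for epoch in range(max_epochs):
    hidden = None  # nn.LSTM creates zero states when hidden is None
    for i, data in enumerate(dataloaders["train"], 0):
        inputs, labels = data

        # Cut the graph from previous iterations: keep the values of (h_n, c_n)
        # but stop autograd from reaching back into graphs whose parameters
        # optimizer.step() has already modified in place.
        if hidden is not None:
            hidden = tuple(h.detach() for h in hidden)

        optimizer.zero_grad()
        outputs, hidden = model(inputs, hidden)
        loss = criterion(outputs, labels)
        loss.backward()   # no retain_graph: each batch owns its own graph
        optimizer.step()

If you really do want gradients to flow through several consecutive batches, you would instead keep the graph over that window, but then the parameters must not be updated between the backward calls that share it.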