Inplace error in LSTM backward pass

Hi everyone,

I am trying to run an LSTM network. It may be used after a convolutional layer, so I transpose the incoming tensor from [ batch_size, in_channels, window_length ] to [ batch_size, window_length, in_channels ], but I get the in-place error below.

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [800, 201]] is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

I pass the layer parameters and the batch size to the LST object. I see what the problem is, but I couldn't find any way to make it work.

import torch
import torch.nn as nn

class LST(nn.Module):
    def __init__(self, param, BS):
        super().__init__()  # required before assigning submodules
        self.batch_size = int(BS)
        self.layer = nn.LSTM( *param )

    def init_hidden_states(self, param):
        if param[6] == True:   # bidirectional flag
            NUM_OF_DIRECTIONS = 2
        else:
            NUM_OF_DIRECTIONS = 1
        self.hidden = (torch.randn(param[2] * NUM_OF_DIRECTIONS, self.batch_size, param[1]).to('cuda'),
                       torch.randn(param[2] * NUM_OF_DIRECTIONS, self.batch_size, param[1]).to('cuda'))

    def forward(self, x):
        # swap the channel and time dimensions: [B, C, W] -> [B, W, C]
        # (view() only reinterprets memory; permute() actually transposes)
        y = x.permute(0, 2, 1)
        z, self.hidden = self.layer(y, self.hidden)
        # back to [B, C, W]
        t = z.permute(0, 2, 1)
        return t

I see the problem here is that the hidden states are modified in place by the layer operation. Can anyone suggest where to detach the hidden states, and how to pass them back for the next batch?
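To show what I mean by detaching and re-using the states, here is a standalone sketch (the sizes and `nn.LSTM` arguments are made up, not my actual config). Detaching after each forward cuts the autograd graph link back to the previous batch, which is the usual way to avoid this class of error when states are carried across iterations:

```python
import torch
import torch.nn as nn

# Toy LSTM with placeholder sizes; batch_first=True so input is [B, T, F]
lstm = nn.LSTM(input_size=8, hidden_size=16, num_layers=1, batch_first=True)
hidden = None  # nn.LSTM uses zero-initialized states when hidden is None

for step in range(3):
    x = torch.randn(4, 10, 8)  # [batch, seq_len, features]
    out, hidden = lstm(x, hidden)
    # detach both h and c so the next backward() does not try to
    # backpropagate through the graph of the previous batch
    hidden = tuple(h.detach() for h in hidden)
    loss = out.sum()
    loss.backward()
```

Without the `detach()` line, the second `loss.backward()` fails because the states still reference the previous iteration's graph.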