RuntimeError: CUDA out of memory with an LSTM

Hi, I am having a memory issue and I’m not sure how to solve it. The issue goes as follows:

RuntimeError: CUDA out of memory. Tried to allocate 28.62 MiB (GPU 0; 11.17 GiB total capacity; 10.38 GiB already allocated; 27.62 MiB free; 14.18 MiB cached)

My code is a DRQN agent that applies 3 convolutions and then passes the result through an LSTM layer in the forward pass, inside an unroll loop. I suspect there are too many intermediate results clogging up the memory, but I’m not sure how to fix it. Can anyone help?

Code:

class DRQNBody(nn.Module):
    def __init__(self, in_channels=4):
        super(DRQNBody, self).__init__()
        self.feature_dim = 512
        self.rnn_input_dim = 7*7*64
        self.batch_size = 1
        self.unroll  = 10 
        in_channels = 1 # for 1 frame input
        self.conv1 = layer_init(nn.Conv2d(in_channels, 32, kernel_size=8, stride=4))
        self.conv2 = layer_init(nn.Conv2d(32, 64, kernel_size=4, stride=2))
        self.conv3 = layer_init(nn.Conv2d(64, 64, kernel_size=3, stride=1))
        self.lstm = nn.LSTM(self.rnn_input_dim, self.feature_dim , num_layers = 1)
        self.hidden = self.init_hidden()

    def forward(self, x):
        ycat = torch.Tensor()
        xchunks = torch.chunk(x, self.unroll, 1)
        for ts in range(len(xchunks)):
            y = F.relu(self.conv1(xchunks[ts]))
            y = F.relu(self.conv2(y))
            y = F.relu(self.conv3(y))
            y = y.view(y.size(0), -1)  # flattening
            ycat = y.view(-1, 1, self.rnn_input_dim)  # adding a dimension for the LSTM input
            output, self.hidden = self.lstm(ycat, self.hidden)  # output_chunks[yt], self.hidden)
        y = torch.squeeze(output, 1)
        return y

Assuming you are running this in a Jupyter kernel, try restarting the kernel (or the system) and see if the error persists. This error generally comes up when data you previously moved to the GPU is still sitting there: PyTorch does not release that memory while references to the tensors remain, so you may have to delete them and run garbage collection manually.
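As a rough illustration of that manual cleanup, here is a minimal sketch. The helper name free_gpu_memory and the tensor batch are made up for the example and are not from the original post. It falls back to the CPU when no GPU is present. Note that torch.cuda.empty_cache() only returns cached blocks that are no longer in use; it cannot free tensors that are still referenced.

```python
import gc
import torch

def free_gpu_memory():
    """Hypothetical helper: collect unreachable Python objects, then
    hand PyTorch's unused cached GPU blocks back to the driver."""
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical leftover tensor from an earlier notebook cell.
batch = torch.zeros(1024, 1024, device=device)

if torch.cuda.is_available():
    print("allocated before:", torch.cuda.memory_allocated())

del batch          # drop the last Python reference to the tensor
free_gpu_memory()  # then let the allocator release what it can

if torch.cuda.is_available():
    print("allocated after:", torch.cuda.memory_allocated())
```

If the numbers do not drop, something else (a list of outputs, a stored loss, a model attribute) is probably still holding a reference.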

You can also refer to this fastai tutorial for reference (link).

You could also try

torch.cuda.empty_cache()

if Kushajveer Singh’s suggestion did not do the trick for you.
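If memory still grows from one call to the next, one possible cause (an assumption, since the training loop is not shown) is that self.hidden is carried over between forward passes together with its autograd history, so the graph of every earlier pass stays alive. A minimal sketch of detaching the hidden state before reusing it; TinyRecurrent is a hypothetical stand-in for DRQNBody with only the LSTM part and made-up sizes:

```python
import torch
import torch.nn as nn

class TinyRecurrent(nn.Module):
    """Hypothetical stand-in for DRQNBody, LSTM part only."""

    def __init__(self, input_dim=8, hidden_dim=16):
        super().__init__()
        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers=1)
        # (h_0, c_0), each of shape (num_layers, batch, hidden_dim)
        self.hidden = (torch.zeros(1, 1, hidden_dim),
                       torch.zeros(1, 1, hidden_dim))

    def forward(self, x):
        # Keep the values but cut the link to earlier forward passes,
        # so their intermediate results can be freed.
        self.hidden = tuple(h.detach() for h in self.hidden)
        output, self.hidden = self.lstm(x, self.hidden)
        return output.squeeze(1)

model = TinyRecurrent()
for _ in range(3):
    out = model(torch.randn(5, 1, 8))  # (seq_len, batch, input_dim)
    out.sum().backward()               # no stale graph to walk through
```

Whether detaching on every call is appropriate depends on how far back you want gradients to flow; for pure action selection, wrapping the call in torch.no_grad() avoids building the graph at all.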