Hi, I'm running into a CUDA out-of-memory error and I'm not sure how to solve it. The error is:
RuntimeError: CUDA out of memory. Tried to allocate 28.62 MiB (GPU 0; 11.17 GiB total capacity; 10.38 GiB already allocated; 27.62 MiB free; 14.18 MiB cached)
My model is a DRQN agent: the forward pass runs three convolutions over each frame inside an unroll loop, then feeds the result through an LSTM layer. My guess is that too many intermediate results are being kept alive and clogging up GPU memory, but I'm not sure how to fix it. Can anyone help?
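If it helps, this is roughly how I plan to check whether allocations grow with each training step (`agent.step()` is just a stand-in for my actual training-loop call):

import torch

# Log GPU memory around each training step to see whether it keeps growing.
# `agent.step()` is a placeholder for my real training-loop call.
for step in range(100):
    agent.step()
    allocated = torch.cuda.memory_allocated() / 1024 ** 2  # MiB held by live tensors
    cached = torch.cuda.memory_cached() / 1024 ** 2        # MiB reserved by the caching allocator
    print('step %d: allocated=%.1f MiB, cached=%.1f MiB' % (step, allocated, cached))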
Code:
import torch
import torch.nn as nn
import torch.nn.functional as F

class DRQNBody(nn.Module):
    def __init__(self, in_channels=4):
        super(DRQNBody, self).__init__()
        self.feature_dim = 512
        self.rnn_input_dim = 7 * 7 * 64  # flattened conv output: 64 maps of 7x7
        self.batch_size = 1
        self.unroll = 10
        in_channels = 1  # override: single-frame input
        # layer_init is a weight-initialisation helper defined elsewhere in my code
        self.conv1 = layer_init(nn.Conv2d(in_channels, 32, kernel_size=8, stride=4))
        self.conv2 = layer_init(nn.Conv2d(32, 64, kernel_size=4, stride=2))
        self.conv3 = layer_init(nn.Conv2d(64, 64, kernel_size=3, stride=1))
        self.lstm = nn.LSTM(self.rnn_input_dim, self.feature_dim, num_layers=1)
        # init_hidden (defined elsewhere) returns the initial (h0, c0) pair
        self.hidden = self.init_hidden()

    def forward(self, x):
        # Split the stacked input along dim 1 into `unroll` single-frame chunks
        xchunks = torch.chunk(x, self.unroll, 1)
        ys = []
        for ts in range(len(xchunks)):
            y = F.relu(self.conv1(xchunks[ts]))
            y = F.relu(self.conv2(y))
            y = F.relu(self.conv3(y))
            y = y.view(y.size(0), -1)                     # flatten conv features
            ys.append(y.view(-1, 1, self.rnn_input_dim))  # add sequence/batch dims
        ycat = torch.cat(ys, 0)  # (unroll, 1, rnn_input_dim) sequence for the LSTM
        output, self.hidden = self.lstm(ycat, self.hidden)
        y = torch.squeeze(output, 1)
        return y
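One thing I suspect is that keeping self.hidden across forward passes keeps the whole computation graph from earlier updates alive, so there is more and more to hold on to at every step. Would detaching the recurrent state between updates, along these lines, be the right fix?

# At the top of forward(), before the unroll loop:
h, c = self.hidden
self.hidden = (h.detach(), c.detach())  # keep the values, drop the old graph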