I am struggling to fit my model on a 16GB GPU due to a CUDA out-of-memory error. What is even more intriguing is that the model runs fine for roughly the first 2000 steps; the memory allocated, as reported by nvidia-smi, gradually grows from 14GB to 16GB, and then the run finally crashes. I have a lot of tensors declared in the forward function via the new_zeros method, which I suspect are not being dereferenced or freed, and that is why the accumulation from 14GB to 16GB is happening. Here is some dummy code:
import torch
import torch.nn as nn

class Test(nn.Module):
    def __init__(self):
        super(Test, self).__init__()
        self.weights = nn.Parameter(torch.zeros(5, 5))

    def forward(self, x):
        # a fresh tensor is allocated here on every forward pass;
        # it must match the (5, i) shape of self.weights @ x
        dummy_constant = x.new_ones(self.weights.shape[0], x.shape[1])
        output = self.weights @ x
        output += dummy_constant
        return output
model = Test()
for i in range(1, 100):
    x = torch.rand(5, i)
    out = model(x)
    # loss.backward() and other stuff
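For reference, this is roughly how the per-step growth could be tracked from inside the loop (a sketch, assuming a CUDA device; `log_cuda_memory` is just a helper name, and on CPU-only machines the guard makes it a no-op):

```python
import torch

def log_cuda_memory(step):
    # memory_allocated: bytes currently held by live tensors
    # memory_reserved: bytes held by PyTorch's caching allocator
    if torch.cuda.is_available():
        allocated = torch.cuda.memory_allocated() / 1024 ** 2
        reserved = torch.cuda.memory_reserved() / 1024 ** 2
        print(f"step {step}: allocated={allocated:.1f} MiB, reserved={reserved:.1f} MiB")
```

Note that nvidia-smi shows the reserved (cached) amount, which can stay flat even while `memory_allocated` fluctuates, so the two numbers need not match.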
So all in all, will every instance of dummy_constant stay in memory even when it goes out of scope?
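As a small scoping experiment (a sketch, assuming weak references behave the same for CPU and CUDA tensors), one can check directly whether a tensor is freed when its last reference is dropped:

```python
import weakref
import torch

t = torch.ones(5, 5)      # stands in for dummy_constant
ref = weakref.ref(t)      # weak reference does not keep the tensor alive
del t                     # drop the only strong reference
print(ref() is None)      # True if the tensor was freed on leaving scope
```

If something (e.g. the autograd graph of a retained output) still holds the tensor, `ref()` would return the live tensor instead of None.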