GPU memory cost

      out = self.conv1(x)
      out = self.norm1(out)
      out = self.relu(out)

      out = self.conv2(out)
      out = self.norm2(out)
      out = self.relu(out)

      out = self.conv3(out)
      out = self.norm3(out)
The settings are as follows (a sketch of this setup is shown below):

  1. training of the conv and norm layers is False
  2. track_running_stats of the norm layers is True
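To be concrete, something like this reproduces the state described above (the layer sizes are simplified placeholders, not the actual bottleneck configuration):

    import torch.nn as nn

    conv1 = nn.Conv2d(64, 64, kernel_size=3, padding=1).cuda()
    norm1 = nn.BatchNorm2d(64, track_running_stats=True).cuda()

    # Put the layers in eval mode (training is False) and freeze their
    # parameters, so requires_grad is False for weight and bias while
    # track_running_stats stays True.
    for m in (conv1, norm1):
        m.eval()
        for p in m.parameters():
            p.requires_grad = False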

When I use PyCharm to debug the code, the GPU memory does not increase as I step from conv1 through conv3, but when I execute norm3 the GPU memory does increase.
If training of the conv and norm layers is True, I have checked that the GPU memory increases after every line I execute.

Since every line creates a new out, why does the GPU memory not increase?
And why does it increase when I execute norm3?

I can see that requires_grad of the weight and bias in the norm layers is False.

The code is from mmdet/models/backbones/resnet.py in mmdetection.

Hi,

Keep in mind that the GPU API is asynchronous, so you might want to add a torch.cuda.synchronize() after each line to make sure it has finished executing before you look at the memory usage.
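For example, something along these lines gives reliable per-line numbers (the layers here are just stand-ins for your conv1/norm1, not the mmdetection modules):

    import torch
    import torch.nn as nn

    def report(tag):
        # Wait for queued GPU kernels to finish before reading memory stats;
        # otherwise the numbers can lag behind the line you just stepped over.
        torch.cuda.synchronize()
        print(f"{tag}: {torch.cuda.memory_allocated() / 1024**2:.1f} MiB allocated")

    # Placeholder layers standing in for conv1/norm1 above.
    conv = nn.Conv2d(64, 64, kernel_size=3, padding=1).cuda().eval()
    norm = nn.BatchNorm2d(64).cuda().eval()

    x = torch.randn(2, 64, 56, 56, device="cuda")
    report("before")
    out = conv(x)
    report("after conv")
    out = norm(out)
    report("after norm")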

Also, since you reuse the out variable, the old Tensor that out was pointing to is no longer reachable and can be freed when you're not training (when training, it needs to be kept around to compute the backward pass).
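Here is a rough illustration of that second point, with toy layers rather than the actual bottleneck: with gradients enabled, the result of the first conv is saved by autograd for the backward pass, so rebinding out does not free it; with gradients disabled, it becomes unreachable and is released.

    import torch
    import torch.nn as nn

    conv1 = nn.Conv2d(64, 64, kernel_size=3, padding=1).cuda()
    conv2 = nn.Conv2d(64, 64, kernel_size=3, padding=1).cuda()
    x = torch.randn(2, 64, 56, 56, device="cuda")

    def run(grad_enabled):
        with torch.set_grad_enabled(grad_enabled):
            out = conv1(x)    # first intermediate result
            out = conv2(out)  # rebinding out: the first result is only kept
                              # alive if autograd needs it for the backward
        torch.cuda.synchronize()
        print(f"grad_enabled={grad_enabled}: "
              f"{torch.cuda.memory_allocated() / 1024**2:.1f} MiB allocated")

    run(False)  # eval-style: the old out is unreachable and gets freed
    run(True)   # training-style: it is held alive by the autograd graph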