Help! The number of model parameters decreases, but the memory consumption increases

I replaced a Conv2d layer with a depthwise separable convolution, like this:

import collections

import torch.nn as nn

def conv_dw(in_channels, out_channels, kernel_size, stride, padding, dilation=1, with_relu=True):
    """Depthwise separable convolution: a depthwise conv followed by a 1x1 pointwise conv."""
    order_dict = collections.OrderedDict()
    # Depthwise convolution: one filter per input channel (groups=in_channels).
    order_dict['conv1'] = nn.Conv2d(in_channels, in_channels, kernel_size, stride, padding,
                                    dilation=dilation, groups=in_channels)
    if with_relu:
        order_dict['relu1'] = nn.ReLU(inplace=True)
    # Pointwise 1x1 convolution that mixes channels and sets the output channel count.
    order_dict['conv2'] = nn.Conv2d(in_channels, out_channels, 1, 1, 0)
    if with_relu:
        order_dict['relu2'] = nn.ReLU(inplace=True)
    return nn.Sequential(order_dict)

The number of model parameters decreases to 1/4, but memory consumption increases during training and I get an OOM error. What should I do?
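For reference, here is a minimal sketch of how the parameter reduction can be checked (it assumes the conv_dw definition above is in scope; the 256-channel, 3x3 sizes are only illustrative, not the actual model):

import torch.nn as nn

# Hypothetical sizes, just for illustration: 256 -> 256 channels, 3x3 kernel.
plain = nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1)
separable = conv_dw(256, 256, kernel_size=3, stride=1, padding=1)  # conv_dw defined above

def num_params(module):
    return sum(p.numel() for p in module.parameters())

print(num_params(plain))      # 590,080  (256*256*3*3 weights + 256 biases)
print(num_params(separable))  # 68,352   (depthwise 2,560 + pointwise 65,792)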

The memory consumption of a model during training is mostly dominated by the stored feature maps (activations), not by the network weights (parameters). Replacing one convolution with two reduces the parameter count but roughly doubles the number of intermediate feature maps that must be kept for the backward pass. This is expected behavior, not something that needs fixing.
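One quick way to see the extra activations is to count the output elements of every Conv2d with forward hooks. A rough sketch, assuming the conv_dw definition from the question and made-up input sizes:

import torch
import torch.nn as nn

# Illustrative input: batch 8, 256 channels, 64x64 feature map.
x = torch.randn(8, 256, 64, 64)

plain = nn.Conv2d(256, 256, 3, 1, 1)
separable = conv_dw(256, 256, 3, 1, 1, with_relu=False)  # conv_dw from the question

def activation_elems(module, inp):
    """Sum the number of output elements produced by every Conv2d in `module`."""
    total = 0
    hooks = []
    def hook(_m, _inp, out):
        nonlocal total
        total += out.numel()
    for m in module.modules():
        if isinstance(m, nn.Conv2d):
            hooks.append(m.register_forward_hook(hook))
    module(inp)
    for h in hooks:
        h.remove()
    return total

print(activation_elems(plain, x))      # one output map:  8*256*64*64 elements
print(activation_elems(separable, x))  # two output maps: roughly double

The depthwise output has to be stored as an input to the pointwise convolution's backward pass, which is where the extra memory goes even though the weights shrink.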
