Help! The number of model parameters decreases, but the memory consumption increases

Q_lee · November 28, 2019, 12:27pm

I replaced a Conv2d layer with a depthwise separable convolution, like this:

import collections

import torch.nn as nn

def conv_dw(in_channels, out_channels, kernel_size, stride, padding, dilation=1, with_relu=True):
    order_dict = collections.OrderedDict()
    # Depthwise convolution: groups=in_channels gives one filter per input channel.
    order_dict['conv1'] = nn.Conv2d(in_channels, in_channels, kernel_size, stride, padding,
                                    dilation=dilation, groups=in_channels)
    if with_relu:
        order_dict['relu1'] = nn.ReLU(inplace=True)
    # Pointwise 1x1 convolution: mixes channels and maps them to out_channels.
    order_dict['conv2'] = nn.Conv2d(in_channels, out_channels, 1, 1, 0)
    if with_relu:
        order_dict['relu2'] = nn.ReLU(inplace=True)
    return nn.Sequential(order_dict)

The number of model parameters decreases to 1/4, but the memory consumption increases during training, and I got an out-of-memory (OOM) error. What should I do?

aauker · November 28, 2019, 6:10pm

The memory consumption of a model during training is mostly dominated by the size and number of feature maps, not by the network weights (parameters), because intermediate feature maps are kept for the backward pass. Replacing one convolution with two may reduce the parameters but roughly doubles the number of feature maps. This is normal and doesn’t require a fix.
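To make the trade-off concrete, here is a rough back-of-the-envelope sketch in plain Python. The layer and input sizes are hypothetical (they are not taken from the model in the question), stride 1 with size-preserving padding is assumed, and only the convolution outputs are counted. Real memory use also depends on what autograd actually saves, the ReLU layers, and the allocator, so treat the numbers as an illustration rather than a measurement.

```python
def conv_params(c_in, c_out, k, groups=1, bias=True):
    """Weights (plus optional biases) of a 2D convolution with a k x k kernel."""
    return (c_in // groups) * c_out * k * k + (c_out if bias else 0)


def feature_map_elems(batch, channels, height, width):
    """Number of elements in one output feature map tensor."""
    return batch * channels * height * width


# Hypothetical sizes, chosen only for illustration.
batch, c_in, c_out, k, h, w = 8, 64, 128, 3, 56, 56

# Standard convolution: one layer, one output tensor.
standard_params = conv_params(c_in, c_out, k)
standard_acts = feature_map_elems(batch, c_out, h, w)

# Depthwise separable: depthwise (groups=c_in) followed by pointwise 1x1.
separable_params = (conv_params(c_in, c_in, k, groups=c_in)
                    + conv_params(c_in, c_out, 1))
# Two layers, so two output tensors (assuming the spatial size is unchanged).
separable_acts = (feature_map_elems(batch, c_in, h, w)
                  + feature_map_elems(batch, c_out, h, w))

print("params:     ", standard_params, "->", separable_params)
print("activations:", standard_acts, "->", separable_acts)
```

With these assumed sizes the parameter count drops from 73,856 to 8,960, while the number of convolution output elements rises from 3,211,264 to 4,816,896. Both are element counts; multiply by 4 for bytes in float32. The extra intermediate tensor from the depthwise layer is why memory can go up even though the weights shrink.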