I replaced the Conv2d layers with depthwise separable convolutions, like this:
import collections
import torch.nn as nn

def conv_dw(in_channels, out_channels, kernel_size, stride, padding, dilation=1, with_relu=True):
    order_dict = collections.OrderedDict()
    # Depthwise convolution: groups=in_channels gives one filter per input channel.
    order_dict['conv1'] = nn.Conv2d(in_channels, in_channels, kernel_size, stride, padding,
                                    dilation=dilation, groups=in_channels)
    if with_relu:
        order_dict['relu1'] = nn.ReLU(inplace=True)
    # Pointwise 1x1 convolution: mixes channels and maps them to out_channels.
    order_dict['conv2'] = nn.Conv2d(in_channels, out_channels, 1, 1, 0)
    if with_relu:
        order_dict['relu2'] = nn.ReLU(inplace=True)
    return nn.Sequential(order_dict)
The number of model parameters drops to 1/4, but memory consumption during training increases, and I now get an out-of-memory (OOM) error. What should I do?
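
To make the comparison concrete, here is a rough sketch in plain Python (no PyTorch needed) that counts parameters and layer-output elements for one regular convolution versus one depthwise separable block. The helper names and the 64-channel, 56x56 sizes are made up for illustration and are not taken from my model. It only counts layer outputs per sample, so what autograd actually keeps for the backward pass may differ.

```python
def conv_params(c_in, c_out, k, groups=1, bias=True):
    """Parameter count of a 2D convolution with a k x k kernel."""
    weights = (c_in // groups) * c_out * k * k
    return weights + (c_out if bias else 0)


def regular_block(c_in, c_out, k, h, w):
    """(parameters, output elements per sample) for one k x k convolution."""
    return conv_params(c_in, c_out, k), c_out * h * w


def separable_block(c_in, c_out, k, h, w):
    """Same counts for a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    params = conv_params(c_in, c_in, k, groups=c_in) + conv_params(c_in, c_out, 1)
    # Two layer outputs instead of one: depthwise (c_in maps) and pointwise (c_out maps).
    activations = c_in * h * w + c_out * h * w
    return params, activations


# Illustrative sizes, assuming stride 1 and padding that keeps h x w unchanged.
reg_params, reg_acts = regular_block(64, 64, 3, 56, 56)
sep_params, sep_acts = separable_block(64, 64, 3, 56, 56)
print(f"regular:   {reg_params} params, {reg_acts} output elements")
print(f"separable: {sep_params} params, {sep_acts} output elements")
```

With these sizes the separable block has far fewer parameters but twice as many layer-output elements, since the depthwise output is an extra intermediate tensor. That would be consistent with training memory going up even though the model itself got smaller.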