How to modify a Conv2d into a Depthwise Separable Convolution?

I just read the MobileNet v1 paper, and I already understand the mechanism behind it.
But how can I achieve that in PyTorch, based on Conv2d?

Should I just use two Conv2d layers to achieve that?

You would normally set the groups parameter of the Conv2d layer. From the docs:

The configuration when groups == in_channels and out_channels = K * in_channels where K is a positive integer is termed in literature as depthwise convolution.

You can find more information here.
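A minimal sketch of what the docs describe, assuming 3x3 kernels, padding that preserves the spatial size, and K = 1 (all numbers chosen just for illustration):

import torch
import torch.nn as nn

# depthwise convolution: groups == in_channels, out_channels == K * in_channels (K = 1 here)
depthwise = nn.Conv2d(in_channels=16, out_channels=16, kernel_size=3, padding=1, groups=16)
x = torch.randn(1, 16, 32, 32)
print(depthwise(x).shape)  # torch.Size([1, 16, 32, 32])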


Thanks for the information! But I don’t think they’re the same thing.
From my perspective, groups just separates the channels.
For example, if the input feature map is DFxDFxM and the output is DFxDFxN, the standard convolution kernel is DKxDKxMxN.
What I mean by depthwise separable convolution can be divided into 2 parts:
part 1: depthwise, where the kernel is DKxDKx1xM
part 2: pointwise, where the kernel is 1x1xMxN
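To make the saving concrete (numbers chosen just for illustration): with DK = 3, M = 64, and N = 128, the standard kernel has 3x3x64x128 = 73,728 weights, while the depthwise part has 3x3x1x64 = 576 and the pointwise part has 1x1x64x128 = 8,192, i.e. 8,768 weights in total, roughly 8x fewer.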

If that is the case, should I just use two Conv2d layers to achieve it?


Does anyone have any ideas?

I believe this answer is a more complete reply to your question.

If groups == nInputPlane, then it is depthwise. If groups == nInputPlane and kernel = (K, 1) (preceded by a Conv2d layer with groups = 1 and kernel = (1, K)), then it is separable.

In short, you can achieve it using Conv2d by setting the groups parameter of your convolutional layers. Hope it helps.


Here is a simple example:

import torch.nn as nn

class depthwise_separable_conv(nn.Module):
    def __init__(self, nin, nout):
        super(depthwise_separable_conv, self).__init__()
        # depthwise: one 3x3 filter per input channel (groups=nin)
        self.depthwise = nn.Conv2d(nin, nin, kernel_size=3, padding=1, groups=nin)
        # pointwise: 1x1 convolution mixing the channels
        self.pointwise = nn.Conv2d(nin, nout, kernel_size=1)

    def forward(self, x):
        out = self.depthwise(x)
        out = self.pointwise(out)
        return out
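Usage would then look something like this (shapes chosen just for illustration):

import torch

conv = depthwise_separable_conv(nin=32, nout=64)
x = torch.randn(1, 32, 56, 56)
print(conv(x).shape)  # torch.Size([1, 64, 56, 56])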

For reference: see numpy implementations of depthwise convolutions vs. grouped convolutions to see how this works.

I am checking the shape of the weights in a Conv2d layer. Say the layer is Conv2d(nin, 4*nin, kernel_size=3, groups=nin); the shape of its weight tensor is (4*nin, 1, 3, 3). Do you know how to interpret this shape?
Is it (4 kernels for channel 1, 4 kernels for channel 2, …) or (1 kernel for channel 1, 1 kernel for channel 2, …, then again 1 kernel for channel 1, 1 kernel for channel 2, …)?

Thanks,

I think the first interpretation is right: the weight layout is (out_channels, in_channels / groups, kH, kW), and consecutive blocks of out_channels / groups filters belong to the same group, so the first 4 kernels all read input channel 1, the next 4 read input channel 2, and so on.
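A quick sanity check (sizes chosen just for illustration):

import torch.nn as nn

nin = 8
conv = nn.Conv2d(nin, 4 * nin, kernel_size=3, groups=nin)
print(conv.weight.shape)  # torch.Size([32, 1, 3, 3])
# output channels 0-3 hold the 4 kernels applied to input channel 0,
# output channels 4-7 the 4 kernels applied to input channel 1, and so on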

Hi, in reply to @shicai’s example: that gives you only one kernel per input channel, which is unlikely to be helpful.

Maybe something like this:

import torch.nn as nn

class depthwise_separable_conv(nn.Module):
    def __init__(self, nin, kernels_per_layer, nout):
        super(depthwise_separable_conv, self).__init__()
        # depthwise: kernels_per_layer 3x3 filters per input channel
        self.depthwise = nn.Conv2d(nin, nin * kernels_per_layer, kernel_size=3, padding=1, groups=nin)
        # pointwise: 1x1 convolution mixing nin * kernels_per_layer channels into nout
        self.pointwise = nn.Conv2d(nin * kernels_per_layer, nout, kernel_size=1)

    def forward(self, x):
        out = self.depthwise(x)
        out = self.pointwise(out)
        return out
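A quick shape check (sizes chosen just for illustration):

import torch

conv = depthwise_separable_conv(nin=8, kernels_per_layer=2, nout=16)
print(conv.depthwise.weight.shape)            # torch.Size([16, 1, 3, 3])
print(conv(torch.randn(1, 8, 28, 28)).shape)  # torch.Size([1, 16, 28, 28])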

You can just set groups=nin / kernels_per_layer instead of nin * kernels_per_layer.

Can you help me figure out how to modify the depthwise convolution into a full convolution? I have it as follows:

class ConvBNReLU(nn.Sequential):
    def __init__(self, in_planes, out_planes, kernel_size, stride=1, groups=1):
        padding = (kernel_size - 1) // 2
        super(ConvBNReLU, self).__init__(
            nn.Conv2d(in_planes, out_planes, kernel_size, stride, padding, groups=groups, bias=False),
            nn.BatchNorm2d(out_planes),
            nn.ReLU6(inplace=True)
        )

class InvertedResidual(nn.Module):
    def __init__(self, inp, oup, kernel, stride, expand_ratio, res_connect):
        super(InvertedResidual, self).__init__()
        self.stride = stride
        assert stride in [1, 2]
        hidden_dim = int(round(inp * expand_ratio))
        self.use_res_connect = res_connect
        layers = []
        if expand_ratio != 1:
            # pw
            layers.append(ConvBNReLU(inp, hidden_dim, kernel_size=1))
        layers.extend([
            # depth-wise
            ConvBNReLU(hidden_dim, hidden_dim, kernel_size=kernel, stride=stride, groups=hidden_dim),
            # pw-linear
            nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
            nn.BatchNorm2d(oup),
        ])
        self.conv = nn.Sequential(*layers)

    def forward(self, x):
        if self.use_res_connect:
            return x + self.conv(x)
        else:
            return self.conv(x)

I’m not sure if I understand the question correctly, but if you don’t want to use the depthwise convolution and would rather use a “vanilla” convolution, you could remove the groups argument or set it to 1.
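In the InvertedResidual block above, that would mean changing the depth-wise line (a minimal sketch of that single change):

# was (depth-wise):
# ConvBNReLU(hidden_dim, hidden_dim, kernel_size=kernel, stride=stride, groups=hidden_dim)
# full (vanilla) convolution: groups defaults to 1
ConvBNReLU(hidden_dim, hidden_dim, kernel_size=kernel, stride=stride)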
