I just read the MobileNet v1 paper.
I already understand the mechanism behind it.
But how can I implement it in PyTorch based on Conv2d?
Should I just use two Conv2d layers to achieve that?
You would normally set the groups parameter of the Conv2d layer. From the docs:
The configuration when groups == in_channels and out_channels = K * in_channels where K is a positive integer is termed in literature as depthwise convolution.
You can find more information here.
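For example, a minimal sketch of a depthwise convolution built that way (the channel counts and K are just illustrative):

import torch
import torch.nn as nn

in_channels, K = 16, 2  # illustrative: K filters per input channel

# groups == in_channels with out_channels == K * in_channels is exactly
# the depthwise configuration described in the docs.
depthwise = nn.Conv2d(in_channels, K * in_channels, kernel_size=3,
                      padding=1, groups=in_channels)

x = torch.randn(1, in_channels, 32, 32)
print(depthwise(x).shape)  # torch.Size([1, 32, 32, 32])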
Thanks for your information! But I don’t think they’re the same thing.
From my perspective, groups means to separate the channels.
For example, if the input feature map is DF×DF×M and the output is DF×DF×N, the standard convolution kernel is DK×DK×M×N.
What I mean by depthwise separable convolution can be divided into two parts:
part 1: depthwise, whose kernel is DK×DK×1×M;
part 2: pointwise, whose kernel is 1×1×M×N.
If that is the case, should I just use two Conv2d layers to achieve it?
Does anyone have some ideas?
I believe this answer is a more complete reply to your question.
If groups = nInputPlane, then it is depthwise. If groups = nInputPlane and kernel = (K, 1) (preceded by a Conv2d layer with groups = 1 and kernel = (1, K)), then it is separable.
In short, you can achieve it using Conv2d by setting the groups parameter of your convolutional layers. Hope it helps.
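To illustrate the separable case from that quote, a minimal sketch (nInputPlane and K are illustrative): a (1, K) convolution with groups=1 followed by a (K, 1) convolution with groups=nInputPlane.

import torch.nn as nn

nInputPlane, K = 16, 5
separable = nn.Sequential(
    nn.Conv2d(nInputPlane, nInputPlane, kernel_size=(1, K), padding=(0, K // 2)),
    nn.Conv2d(nInputPlane, nInputPlane, kernel_size=(K, 1), padding=(K // 2, 0),
              groups=nInputPlane),
)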
Here is a simple example of a depthwise separable convolution:
import torch.nn as nn

class depthwise_separable_conv(nn.Module):
    def __init__(self, nin, nout):
        super(depthwise_separable_conv, self).__init__()
        # depthwise: one 3x3 filter per input channel (groups=nin)
        self.depthwise = nn.Conv2d(nin, nin, kernel_size=3, padding=1, groups=nin)
        # pointwise: 1x1 convolution mixes the channels
        self.pointwise = nn.Conv2d(nin, nout, kernel_size=1)

    def forward(self, x):
        out = self.depthwise(x)
        out = self.pointwise(out)
        return out
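A quick shape check (sizes are illustrative):

import torch

x = torch.randn(1, 16, 32, 32)
conv = depthwise_separable_conv(nin=16, nout=32)
print(conv(x).shape)  # torch.Size([1, 32, 32, 32])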
For reference: see numpy implementations of depthwise convolutions vs. grouped convolutions to see how this works.
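Along those lines, here is a minimal numpy sketch of a depthwise convolution (stride 1, no padding; the function is hypothetical), which makes the per-channel behaviour explicit:

import numpy as np

def depthwise_conv2d(x, w):
    # x: (M, H, W) input; w: (M, k, k) -- exactly one filter per channel,
    # so channels never mix (this is what groups == in_channels gives you).
    M, H, W = x.shape
    k = w.shape[1]
    out = np.zeros((M, H - k + 1, W - k + 1))
    for m in range(M):
        for i in range(H - k + 1):
            for j in range(W - k + 1):
                out[m, i, j] = np.sum(x[m, i:i + k, j:j + k] * w[m])
    return out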
I am checking the shape of the weights in a Conv2d layer. Say the layer is Conv2d(nin, 4*nin, kernel_size=3, groups=nin); the shape of its weight tensor is (4*nin, 1, 3, 3). How should I interpret this shape: (4 kernels for channel 1, 4 kernels for channel 2, …) or (1 kernel for channel 1, 1 kernel for channel 2, …, 1 kernel for channel 1, 1 kernel for channel 2, …)?
Thanks,
I think the first interpretation is right.
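One way to check this empirically (a small sketch with illustrative sizes): feed in an input that is nonzero in only one channel and see which output channels respond.

import torch
import torch.nn as nn

nin = 2
conv = nn.Conv2d(nin, 4 * nin, kernel_size=3, padding=1, groups=nin, bias=False)
print(conv.weight.shape)  # torch.Size([8, 1, 3, 3])

x = torch.zeros(1, nin, 5, 5)
x[:, 0] = 1.0                     # only input channel 0 is nonzero
out = conv(x)
print(out.abs().sum(dim=(2, 3)))  # only output channels 0..3 are nonzero:
                                  # 4 kernels for channel 1, then 4 for channel 2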
Hi, in reply to @shicai’s example: that gives you only one kernel per input channel, which is unlikely to be helpful.
Maybe something like this:
import torch.nn as nn

class depthwise_separable_conv(nn.Module):
    def __init__(self, nin, kernels_per_layer, nout):
        super(depthwise_separable_conv, self).__init__()
        # depthwise: kernels_per_layer filters per input channel (a depth multiplier)
        self.depthwise = nn.Conv2d(nin, nin * kernels_per_layer, kernel_size=3,
                                   padding=1, groups=nin)
        # pointwise: 1x1 convolution mixes the expanded channels
        self.pointwise = nn.Conv2d(nin * kernels_per_layer, nout, kernel_size=1)

    def forward(self, x):
        out = self.depthwise(x)
        out = self.pointwise(out)
        return out
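For a rough sense of the savings (channel counts are illustrative), you can compare parameter counts against a standard 3x3 convolution, using the class above:

import torch.nn as nn

def n_params(m):
    return sum(p.numel() for p in m.parameters())

standard = nn.Conv2d(32, 64, kernel_size=3, padding=1)
ds = depthwise_separable_conv(nin=32, kernels_per_layer=1, nout=64)

print(n_params(standard))  # 18496 = 32*64*3*3 + 64
print(n_params(ds))        # 2432  = (32*3*3 + 32) + (32*64 + 64)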
You can also set groups=nin/kernels_per_layer instead of nin*kernels_per_layer.
Can you help me figure out how to modify the depthwise convolution to a full convolution? I have it as follows:
import torch.nn as nn

class ConvBNReLU(nn.Sequential):
    def __init__(self, in_planes, out_planes, kernel_size, stride=1, groups=1):
        padding = (kernel_size - 1) // 2
        super(ConvBNReLU, self).__init__(
            nn.Conv2d(in_planes, out_planes, kernel_size, stride, padding,
                      groups=groups, bias=False),
            nn.BatchNorm2d(out_planes),
            nn.ReLU6(inplace=True)
        )

class InvertedResidual(nn.Module):
    def __init__(self, inp, oup, kernel, stride, expand_ratio, res_connect):
        super(InvertedResidual, self).__init__()
        self.stride = stride
        assert stride in [1, 2]

        hidden_dim = int(round(inp * expand_ratio))
        self.use_res_connect = res_connect

        layers = []
        if expand_ratio != 1:
            # pw
            layers.append(ConvBNReLU(inp, hidden_dim, kernel_size=1))
        layers.extend([
            # depth-wise
            ConvBNReLU(hidden_dim, hidden_dim, kernel_size=kernel, stride=stride,
                       groups=hidden_dim),
            # pw-linear
            nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False),
            nn.BatchNorm2d(oup),
        ])
        self.conv = nn.Sequential(*layers)

    def forward(self, x):
        if self.use_res_connect:
            return x + self.conv(x)
        else:
            return self.conv(x)
I’m not sure if I understand the question correctly, but if you don’t want the depthwise convolution and would like a “vanilla” convolution instead, you could remove the groups argument or set it to 1.
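Concretely, in the layers list above, the depthwise line would become something like this (a sketch; everything else stays the same):

# full convolution instead of depthwise: drop groups (it defaults to 1)
ConvBNReLU(hidden_dim, hidden_dim, kernel_size=kernel, stride=stride),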