Number of parameters in depthwise separable convolution (Xception)

jtang10 · March 14, 2018, 9:04pm

Hello there,

I have recently been studying the Xception paper and came across the depthwise separable convolution (DW conv). I think I understand how it works and how it is implemented in PyTorch, but I don’t understand the number of parameters in that layer.

For example, I have a convolution layer (no bias) with in_channels = 16, out_channels = 32 and kernel_size = 3. A traditional convolution should have 16x32x3x3 = 4608 parameters, and a DW conv (which sets groups=in_channels in the PyTorch implementation) should have 16x3x3 + 16x1x1x32 = 656 parameters.

I printed the parameters of the conv layer using the parameters() function and verified the count for the traditional conv layer, but the DW conv has a single weight of size 32x1x3x3 (288 parameters), which differs from the Xception paper. It appears that the second dimension (1 in this example) is in_channels / groups.
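The shape rule observed above can be checked by hand. Here is a minimal sketch in plain Python, assuming a 2-D convolution weight is laid out as (out_channels, in_channels / groups, kernel_size, kernel_size). `conv2d_weight_shape` is a hypothetical helper written for this illustration, not a PyTorch function:

```python
import math


def conv2d_weight_shape(in_channels, out_channels, kernel_size, groups=1):
    """Hypothetical helper: weight shape of a square-kernel 2-D convolution."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("in_channels and out_channels must be divisible by groups")
    # Each output channel only sees in_channels / groups input channels.
    return (out_channels, in_channels // groups, kernel_size, kernel_size)


# Traditional convolution from the example: 16 -> 32 channels, 3x3 kernel.
standard_shape = conv2d_weight_shape(16, 32, 3)
standard_params = math.prod(standard_shape)

# Same layer with groups=in_channels, as in the question.
grouped_shape = conv2d_weight_shape(16, 32, 3, groups=16)
grouped_params = math.prod(grouped_shape)

# Count expected from the paper's two-step formulation.
expected_separable_params = 16 * 3 * 3 + 16 * 1 * 1 * 32

print(standard_shape, standard_params)
print(grouped_shape, grouped_params)
print(expected_separable_params)
```

Under that assumption the single grouped layer has 32x1x3x3 = 288 weights, which is neither the 4608 of the traditional layer nor the 656 expected for the two-step version.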

Can someone help explain how this is implemented, or did I do something wrong? Thank you very much for the help!

jtang10 · March 15, 2018, 2:15am

Trying to answer my own question here after reading some posts again. Using groups on its own only gives the depthwise convolution, so setting groups = in_channels = out_channels fulfills the first part of the depthwise separable convolution; we can then manually add a 1x1 convolution to fulfill the second part, as described in the Xception paper.
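A minimal sketch of that two-step layer, assuming the depthwise step comes first and the 1x1 step second, as described above. The class name `DepthwiseSeparableConv2d` and the padding=1 default are my own choices for illustration, not something PyTorch provides:

```python
import torch
from torch import nn


class DepthwiseSeparableConv2d(nn.Module):
    """Illustrative module: depthwise conv followed by a 1x1 pointwise conv."""

    def __init__(self, in_channels, out_channels, kernel_size=3, padding=1):
        super().__init__()
        # Step 1: one spatial filter per input channel
        # (groups = in_channels = out_channels).
        self.depthwise = nn.Conv2d(
            in_channels, in_channels, kernel_size,
            padding=padding, groups=in_channels, bias=False,
        )
        # Step 2: 1x1 convolution that mixes channels and sets the output width.
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))


layer = DepthwiseSeparableConv2d(16, 32)
for name, param in layer.named_parameters():
    print(name, tuple(param.shape))

n_params = sum(p.numel() for p in layer.parameters())
out_shape = tuple(layer(torch.randn(1, 16, 8, 8)).shape)
print(n_params, out_shape)
```

With in_channels = 16 and out_channels = 32 the two weights have sizes 16x1x3x3 and 32x16x1x1, i.e. 144 + 512 = 656 parameters, which matches the count from the original question.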

My own conclusion is that this is more like a “grouped convolution”, like in this post, than a depthwise separable convolution or a “depthwise convolution” (the first step of the depthwise separable convolution, before the pointwise convolution).