The PyTorch docs for the groups parameter of nn.Conv2d state that:
groups controls the connections between inputs and outputs. in_channels and out_channels must both be divisible by groups. For example,
At groups=1, all inputs are convolved to all outputs.
At groups=2, the operation becomes equivalent to having two conv layers side by side, each seeing half the input channels, and producing half the output channels, and both subsequently concatenated.
At groups=in_channels, each input channel is convolved with its own set of filters, of size out_channels / in_channels.
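To make the last bullet concrete, here is a small sketch of the groups=in_channels (depthwise) case. The specific channel counts (16 in, 32 out) are just an illustrative choice, not from the docs:

```python
import torch
import torch.nn as nn

# Depthwise case: groups == in_channels.
# With 16 input channels and 32 output channels, each input channel
# gets its own set of out_channels / in_channels = 2 filters.
depthwise = nn.Conv2d(16, 32, kernel_size=3, groups=16, bias=False)
print(depthwise.weight.shape)  # torch.Size([32, 1, 3, 3])
```

Note that the second weight dimension is 1: each filter sees only a single input channel.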
However, this description seems inconsistent with the actual behaviour of nn.Conv2d. For example:
import torch
import torch.nn as nn
conv_layer = nn.Conv2d(16, 16, 1, groups=2, bias=False)
conv_layer.weight.shape
returns torch.Size([16, 8, 1, 1]).
But based on my interpretation of:
At groups=2, the operation becomes equivalent to having two conv layers side by side, each seeing half the input channels, and producing half the output channels, and both subsequently concatenated.
shouldn't there be two weight tensors, each of size [8, 8, 1, 1]?
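To show what I mean, here is my attempted reconstruction of the "two conv layers side by side" reading. The splitting of the single [16, 8, 1, 1] weight into two halves is my own assumption, not anything stated in the docs:

```python
import torch
import torch.nn as nn

grouped = nn.Conv2d(16, 16, 1, groups=2, bias=False)

# My reading of the docs: two independent convs, each seeing half the
# input channels and producing half the output channels.
conv_a = nn.Conv2d(8, 8, 1, bias=False)
conv_b = nn.Conv2d(8, 8, 1, bias=False)
with torch.no_grad():
    # Assumption: the single weight is just the two halves stacked along dim 0.
    conv_a.weight.copy_(grouped.weight[:8])
    conv_b.weight.copy_(grouped.weight[8:])

x = torch.randn(1, 16, 4, 4)
out_grouped = grouped(x)
out_split = torch.cat([conv_a(x[:, :8]), conv_b(x[:, 8:])], dim=1)
print(torch.allclose(out_grouped, out_split))  # True
```

If this equivalence holds, then the [16, 8, 1, 1] weight would simply be the two [8, 8, 1, 1] weights concatenated along the output-channel dimension, but I'd like confirmation that this is the intended reading.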
As far as I can tell, the same apparent inconsistency shows up for other values of the groups parameter as well.
I must be missing something - could someone please clarify?