The convolutional layers (e.g. `nn.Conv2d`) require `groups` to divide both `in_channels` and `out_channels`. The functional convolutions (e.g. `nn.functional.conv2d`) only require `groups` to divide `in_channels`.
This leads to confusing behavior:
```python
import numpy as np
import torch
from torch import nn
from torch.nn import functional as F

# testing
batch_size = 1
w_img = 1
h_img = 1
c_in = 6
c_out = 9
filter_len = 1
groups = 3

image = np.arange(6, dtype=np.float32).reshape(batch_size, c_in, h_img, w_img)
filters = np.empty((c_out, c_in // groups, filter_len, filter_len), dtype=np.float32)
filters.fill(0.5)

image = torch.tensor(image)
filters = torch.tensor(filters)

features_functional = F.conv2d(image, filters, padding=filter_len // 2, groups=groups)
print(features_functional.shape)  # 9

layer = nn.Conv2d(c_in, c_out, filter_len, padding=filter_len // 2, groups=groups)
print(layer.out_channels)  # 9
```
Here both forms produce 9 `out_channels`. Setting `groups` to 2, however, results in 8 `out_channels` from the functional form and an exception thrown from the layer form.
I see two problems with this:
- Inconsistency (even though both behaviors match their documentation).
- The functional form opaquely rounds `out_channels` down to the nearest multiple of `groups`. This silent truncation is non-obvious to the user.
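The rounding rule inferred from the example above can be sketched in plain Python (the function name `effective_out_channels` is hypothetical, introduced here only for illustration, and the rule itself is an assumption deduced from the observed shapes, not a documented PyTorch API):

```python
def effective_out_channels(out_channels: int, groups: int) -> int:
    # Each group receives out_channels // groups filters; any remaining
    # filters appear to be silently dropped by the functional form.
    return (out_channels // groups) * groups

print(effective_out_channels(9, 3))  # 9 -- divides evenly, nothing dropped
print(effective_out_channels(9, 2))  # 8 -- one filter silently ignored
```

Under this rule, passing a weight tensor with 9 output filters and `groups=2` yields a feature map with only 8 channels, which matches the behavior described above.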
Is there a reason for this difference? If not, it seems like the functional form should throw a similar error.