I have a 2D image with lots (hundreds) of channels.
Nearby channels are highly correlated.
For now I'm using an entry group with several Conv2d layers with kernel size = (1, 1).
It's working OK.
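The entry group is roughly this (the channel counts here are made up just for illustration; the only point is the kernel_size=1 layers):

import torch
import torch.nn as nn

# a few 1x1 Conv2d layers that mix channels per pixel (illustrative sizes only)
entry = nn.Sequential(
    nn.Conv2d(300, 64, kernel_size=1), nn.ReLU(),
    nn.Conv2d(64, 32, kernel_size=1), nn.ReLU(),
)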
But I assume that doing a 1D convolution along the channel axis, before the spatial 2D convolutions,
would let me build a smaller and more accurate model.
I've created this straightforward wrapper
for converting the (N, C, H, W) layout to (N*H*W, 1, C) (which is what Conv1d accepts),
and back:
class InChannelConv(nn.Module):
    def __init__(self, body):
        super().__init__()
        self.body = body

    def forward(self, x):
        n2, c2, h2, w2 = x.size()
        # (N, C, H, W) -> (N*H*W, 1, C): every pixel becomes its own sample,
        # and its channel vector becomes the 1D signal that Conv1d slides over
        x = x.permute(0, 2, 3, 1).contiguous().view(n2 * h2 * w2, 1, c2)
        x = self.body(x)
        # (N*H*W, C_out, L) -> (N, C_out*L, H, W): fold the body's output
        # back into a per-pixel channel dimension
        x = x.view(n2, h2, w2, -1).permute(0, 3, 1, 2).contiguous()
        return x
And I'm using my wrapper like this:
InChannelConv(nn.Sequential(
    nn.Conv1d(1, 32, 7), nn.ReLU(),
    nn.Conv1d(32, 64, 3), nn.ReLU(),
    nn.Conv1d(64, 1, 1), nn.ReLU(),
))
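For concreteness, with the module above bound to a name (channel_conv below; all tensor sizes are made up for illustration), a call looks like this:

x = torch.randn(2, 300, 64, 64)  # (N, C, H, W)
y = channel_conv(x)              # the body sees a (2*64*64, 1, 300) = (8192, 1, 300) batch
print(y.shape)                   # torch.Size([2, 292, 64, 64]), since 300 - 6 - 2 = 292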
...
That gives me an OOM, even with a very small body.
When I use a uselessly small body like Conv1d(1, 1, 1), I get:
RuntimeError: CUDNN_STATUS_NOT_SUPPORTED. This error may appear if you passed in a non-contiguous input.
I know about nn.PixelShuffle, but it's not exactly what I want:
it mixes spatial and “in-channel” convolution,
and moreover I can't make any assumptions about the channel count.
So here is the question:
how do I do an efficient 1D convolution along the channel axis (independently for every pixel),
given a lots-of-channels 2D image as both input and output?
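To pin down the semantics I'm after: it should be equivalent to this naive per-pixel loop (hopelessly slow, only for illustration; the function name, kernel shape and sizes are made up):

import torch
import torch.nn.functional as F

def per_pixel_channel_conv(x, weight):           # x: (N, C, H, W), weight: (C_out, 1, K)
    n, c, h, w = x.size()
    out = []
    for i in range(h):
        for j in range(w):
            pixel = x[:, :, i, j].unsqueeze(1)   # (N, 1, C): one pixel's channel vector
            out.append(F.conv1d(pixel, weight))  # (N, C_out, C - K + 1)
    # reassemble into an (N, C_out * (C - K + 1), H, W) image
    return torch.stack(out, dim=-1).view(n, -1, h, w)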