Broadcasting the depthwise convolution kernel

I am trying to do depthwise convolution with a fixed kernel. My kernel is 2D, and I want to use the same 2D kernel for all channels. This is what I am currently doing (code adapted from antialiased-cnns):

import torch
import torch.nn.functional as F

channels = 256
x = torch.randn([32, channels, 64, 64])

# 3x3 binomial low-pass filter, normalized to sum to 1
lpf = torch.tensor([[1., 2., 1.],
                    [2., 4., 2.],
                    [1., 2., 1.]])
lpf = lpf / torch.sum(lpf)
lpf = lpf.unsqueeze(dim=0).unsqueeze(dim=0)  # shape [1, 1, 3, 3]
lpf = lpf.repeat([channels, 1, 1, 1])        # shape [channels, 1, 3, 3]
y = F.conv2d(input=x, weight=lpf, padding=1, groups=channels)

While this works correctly, I was wondering if I could somehow get rid of the lpf.repeat() call. Is there a way to broadcast the kernel that doesn't require an explicit .repeat() operation?

You could use expand(256, -1, -1, -1) instead of repeat; expand returns a view over the same 9 values rather than copying them once per channel.
Note, however, that you would only save ~9 kB of memory, since the repeated kernel is tiny:

print(lpf.nelement() * 4 / 1024)  # 256*1*3*3 float32 values * 4 bytes each, in kB
> 9.0
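
For completeness, here is a minimal self-contained sketch of the expand-based version (the channel count of 256 and the rest of the setup are taken from the question; note that conv2d may still materialize a contiguous copy of the weight internally, so the saving is mainly in the stored tensor):

import torch
import torch.nn.functional as F

channels = 256
x = torch.randn([32, channels, 64, 64])

lpf = torch.tensor([[1., 2., 1.],
                    [2., 4., 2.],
                    [1., 2., 1.]])
lpf = lpf / torch.sum(lpf)
lpf = lpf.unsqueeze(dim=0).unsqueeze(dim=0)  # shape [1, 1, 3, 3]

# expand() creates a stride-0 view over the same 9 floats instead of
# copying them `channels` times the way repeat() does.
weight = lpf.expand(channels, -1, -1, -1)    # shape [channels, 1, 3, 3]

y = F.conv2d(input=x, weight=weight, padding=1, groups=channels)
assert y.shape == x.shape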