I have an issue with the time taken by grouped convolution.
I compared it against a normal convolution with the same number of parameters, and the normal convolution runs much faster.
Test code is as follows:

import time
import torch

# dataloader is given; it yields batches of shape (32, 12, 50, 50)
loader = enumerate(dataloader)

# normal convolution
net1 = torch.nn.Conv2d(12, 300, kernel_size=5, stride=1, padding=(2, 2), bias=False)
# grouped convolution
net2 = torch.nn.Conv2d(12, 12 * 300, groups=12, kernel_size=5, stride=1, padding=(2, 2), bias=False)
net1.cuda()
net2.cuda()

# compare normal conv and grouped conv
for _ in range(3):
    _, (a, b) = next(loader)
    a = a.cuda()
    torch.cuda.synchronize()
    start = time.time()
    o = net1(a)  # line**
    torch.cuda.synchronize()
    end = time.time()
I recorded the time taken by the convolution operation, then repeated the measurement with line** replaced by "o = net2(a)".
The following is with the normal convolution:
iter 0 : 0.12s
iter 1 : 0.00536s
iter 2 : 0.00494s
Then I replaced net1 with net2, i.e., the normal convolution with the grouped convolution:
iter 0 : 0.2497s
iter 1 : 0.2162s
iter 2 : 0.20505s
I believe net1 and net2 have the same number of weight parameters, and they give exactly the same result if the grouped-convolution output is reshaped and summed over the input-channel dimension. However, the time cost is very different. In particular, the normal convolution's time drops dramatically after the first iteration, while the grouped convolution's does not.
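To make the equivalence claim concrete, here is a small CPU sketch (illustrative sizes, no dataloader) that copies net1's weights into net2 with the matching group layout and checks that the reshaped-and-summed grouped output equals the normal convolution's output:

```python
import torch

torch.manual_seed(0)
C_in, C_out, K = 12, 300, 5

net1 = torch.nn.Conv2d(C_in, C_out, kernel_size=K, padding=2, bias=False)
net2 = torch.nn.Conv2d(C_in, C_in * C_out, groups=C_in,
                       kernel_size=K, padding=2, bias=False)

# Arrange net2's weights so that group g holds net1's kernels for input
# channel g: net1.weight is (C_out, C_in, K, K); net2.weight must be
# (C_in * C_out, 1, K, K), ordered group-by-group.
with torch.no_grad():
    net2.weight.copy_(
        net1.weight.permute(1, 0, 2, 3).reshape(C_in * C_out, 1, K, K))

x = torch.randn(2, C_in, 10, 10)
o1 = net1(x)                                      # (2, C_out, 10, 10)
o2 = net2(x)                                      # (2, C_in*C_out, 10, 10)
# Sum over the input-channel groups to recover the normal conv result.
o2_summed = o2.view(2, C_in, C_out, 10, 10).sum(dim=1)

print(torch.allclose(o1, o2_summed, atol=1e-5))
```

Both nets have 300 x 12 x 5 x 5 = 90,000 weights, so the parameter counts match as stated; only the grouping differs.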
Can I run a grouped convolution as fast as a normal convolution?
I need grouped convolution to implement the following custom layer:
X1 … Xn (Xi is the i-th input channel) --> Y1 … Ym (Yj is the j-th output channel)
where Yj = \sum_i (Xi * Kij) @ (Xi * L),
Kij is a learned kernel, * is convolution, @ is elementwise multiplication, and L is another, fixed kernel.
To implement this I used grouped convolution for the two convolutions, then elementwise multiplication and a sum over the input channels. But it is too slow.
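For reference, here is a minimal sketch of how I understand that implementation, assuming n input channels, m output channels, and a k x k kernel (the class and parameter names are illustrative, not from my actual code):

```python
import torch

class CustomLayer(torch.nn.Module):
    def __init__(self, n, m, k):
        super().__init__()
        self.n, self.m = n, m
        pad = k // 2
        # X_i * K_ij for every pair (i, j): grouped conv, group i -> m channels
        self.conv_k = torch.nn.Conv2d(n, n * m, kernel_size=k,
                                      padding=pad, groups=n, bias=False)
        # X_i * L with a fixed kernel L: depthwise conv with frozen weights
        self.conv_l = torch.nn.Conv2d(n, n, kernel_size=k,
                                      padding=pad, groups=n, bias=False)
        self.conv_l.weight.requires_grad_(False)

    def forward(self, x):
        N, _, H, W = x.shape
        a = self.conv_k(x).view(N, self.n, self.m, H, W)  # (X_i * K_ij)
        b = self.conv_l(x).view(N, self.n, 1, H, W)       # (X_i * L)
        # Elementwise multiply (broadcast over j), then sum over channels i.
        return (a * b).sum(dim=1)                         # (N, m, H, W)

layer = CustomLayer(n=12, m=300, k=5)
y = layer(torch.randn(2, 12, 50, 50))
print(y.shape)  # torch.Size([2, 300, 50, 50])
```

The grouped convolutions dominate the runtime, which is why I am asking about their speed.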