Hi, I was benchmarking different convolutional layers on CPU. It seems that fewer output channels do not necessarily mean a faster runtime. This is my test:
import timeit
import torch

def benchmark(model, inp_size=(1, 1, 28, 28), n=1000):
    model.eval()
    dummy_input = torch.rand(*inp_size)
    func = lambda: model(dummy_input)
    runtimes = timeit.repeat(func, repeat=10, number=n)
    print(model)
    print(min(runtimes))
    print()
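For reference, the function can be invoked like this. This is a minimal, self-contained sketch of two of the configurations from the results; I use a much smaller n so it runs quickly (the actual numbers below were measured with the default n=1000), and I return the minimum so it can be inspected programmatically, which the original function does not do:

```python
import timeit
import torch
import torch.nn as nn

def benchmark(model, inp_size=(1, 1, 28, 28), n=1000):
    """Time n forward passes, repeated 10 times; report the best repeat."""
    model.eval()
    dummy_input = torch.rand(*inp_size)
    with torch.no_grad():  # inference only, skip autograd bookkeeping
        runtimes = timeit.repeat(lambda: model(dummy_input), repeat=10, number=n)
    print(model)
    print(min(runtimes))
    print()
    return min(runtimes)

# Two of the layer configurations from the results, with a small n for a quick run
t64 = benchmark(nn.Conv2d(1, 64, kernel_size=7, padding=1), n=10)
t61 = benchmark(nn.Conv2d(1, 61, kernel_size=7, padding=1), n=10)
```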
The results:

Conv2d(1, 64, kernel_size=(7, 7), stride=(1, 1), padding=(1, 1))
0.13669140800000035

Conv2d(1, 61, kernel_size=(7, 7), stride=(1, 1), padding=(1, 1))
0.19433658299999967

Conv2d(1, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
0.1496824219999997

Conv2d(1, 125, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
0.2669210919999996

Conv2d(1, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
0.501409099

Conv2d(1, 509, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
1.0796660310000021

Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
0.666838663

Conv2d(61, 61, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
0.8703990200000007

Conv2d(33, 33, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
0.5053342729999954
Questions:
- Is my benchmarking methodology correct?
- Why is using fewer channels not faster?
- Would a network compiler like Glow accelerate layers with fewer channels?