How to reduce the overhead of conv layers?

cwan · June 5, 2019, 6:46pm

I’m trying to optimize inference latency, but I found a strange overhead in a single conv layer. The latency still fits a y = kx + b relationship (with x being the number of output channels), but the constant term b is unacceptably large. How can I reduce b? Thanks in advance!

The profiling results are as follows.

[filter, filter, input_channels, output_channels]	Latency (s, total for the 50 timed calls)
[3, 3, 16, 8]	0.024658
[3, 3, 16, 16]	0.032011
[3, 3, 16, 32]	0.031948
[3, 3, 16, 64]	0.037025
[3, 3, 16, 128]	0.049538
[3, 3, 16, 256]	0.062251
[3, 3, 16, 512]	0.105888
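To put a number on b, here is a minimal sketch that fits y = kx + b to the table above by ordinary least squares, taking x as the output-channel count. The helper name `fit_line` is made up for illustration, and a straight line is only a rough model for these seven points.

```python
# Output-channel counts and latencies copied from the table above.
channels = [8, 16, 32, 64, 128, 256, 512]
latency = [0.024658, 0.032011, 0.031948, 0.037025, 0.049538, 0.062251, 0.105888]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = k*x + b; returns (k, b).

    Hypothetical helper, not part of the original profiling script.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is covariance(x, y) / variance(x); the 1/n factors cancel.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx
    # The fitted line passes through the point (mean_x, mean_y).
    b = mean_y - k * mean_x
    return k, b


k, b = fit_line(channels, latency)
print(f"k = {k:.3e} per output channel, b = {b:.4f}")
```

On these numbers the intercept comes out at roughly 0.027 in the table's units, which is larger than the smallest measured latency, so the fixed cost dominates for small channel counts.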

The profiling code is as follows:

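import datetime

import numpy as np
import torch
import torch.nn.functional as F

def get_seconds(interval):
  # Helper not shown in the original post; assumed to return elapsed seconds.
  return interval.total_seconds()

for i in range(7):
  # [filter, filter, input_channels, output_channels]; output channels run 8..512
  shape = [3, 3, 16, 2**(i+3)]
  kernel_value = np.random.rand(*shape).astype(np.float32)
  # Reorder to PyTorch's (out_channels, in_channels, kH, kW) weight layout
  kernel = torch.as_tensor(np.transpose(kernel_value, (3, 2, 0, 1)))
  input_value = np.random.rand(1, 16, 32, 32).astype(np.float32)
  x = torch.as_tensor(input_value)

  before = datetime.datetime.now()
  for j in range(100):
    # The first 50 iterations are warm-up; restart the clock at j == 50
    if j == 50:
      before = datetime.datetime.now()
    tmp = F.conv2d(x, weight=kernel, bias=None, stride=1, padding=(3-1)//2)
  after = datetime.datetime.now()
  interval = after - before
  print(str(shape) + "\t" + str(get_seconds(interval)))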
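As a cross-check on the measurement itself, the same warm-up-then-time pattern can be sketched with `time.perf_counter`, which is a monotonic high-resolution clock intended for timing short intervals. The helper name `time_call` and its default counts are assumptions for illustration, not part of the original script.

```python
import time


def time_call(fn, warmup=50, iters=50):
    """Return the mean wall-clock seconds per call of fn().

    Hypothetical helper: runs `warmup` untimed calls first, then
    averages over `iters` timed calls.
    """
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed / iters


# Stand-in workload so the sketch runs without PyTorch installed.
per_call = time_call(lambda: sum(range(1000)))
print(f"{per_call:.3e} s per call")
```

In the profiling loop above, the workload could be passed as `lambda: F.conv2d(x, weight=kernel, bias=None, stride=1, padding=1)`. Reporting seconds per call rather than a total over 50 calls makes the intercept b easier to compare across machines.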

cwan · June 6, 2019, 4:22pm

I just tried it on some other machines (CPU only). The problem appears on them too.