# Pointwise Conv1d slower than Linear

When I use `torch.nn.Conv1d` to perform a pointwise convolution, it is significantly slower than `torch.nn.Linear`, even though I would expect these two operations to run at similar speed.
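For context, the two layers compute the same function: a `Conv1d` with `kernel_size=1` is a `Linear` applied independently at every time step. A quick sketch to verify this numerically (CPU, hypothetical shapes matching the benchmark below):

```python
import torch

torch.manual_seed(0)
linear = torch.nn.Linear(512, 1024)
conv = torch.nn.Conv1d(512, 1024, kernel_size=1)

# Copy the Linear weights into the conv kernel: (out, in) -> (out, in, 1)
with torch.no_grad():
    conv.weight.copy_(linear.weight.unsqueeze(-1))
    conv.bias.copy_(linear.bias)

x = torch.randn(50, 80, 512)                       # Batch x Time x Channel
y_linear = linear(x)                               # (50, 80, 1024)
y_conv = conv(x.transpose(1, 2)).transpose(1, 2)   # channels-first for conv

print(torch.allclose(y_linear, y_conv, atol=1e-4))  # True
```

So any timing gap comes from the chosen kernels/algorithms, not from the math.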

Update: the following code and results are updated according to what @ptrblck suggested.

```python
import torch
import time

torch.backends.cudnn.benchmark = True

def linear(x, times=1000):
    m1 = torch.nn.Linear(512, 1024).cuda()
    m2 = torch.nn.Linear(1024, 512).cuda()
    # Warm-up runs so cudnn.benchmark can select algorithms before timing
    for i in range(10):
        y = m2(m1(x))
    torch.cuda.synchronize()
    start = time.time()
    for i in range(times):
        h = m1(x)
        y = m2(h)
    torch.cuda.synchronize()
    duration = (time.time() - start) / times
    return duration

def conv1d(x, times=1000):
    m1 = torch.nn.Conv1d(512, 1024, kernel_size=1, stride=1).cuda()
    m2 = torch.nn.Conv1d(1024, 512, kernel_size=1, stride=1).cuda()
    x = x.transpose(1, 2)
    # Warm-up runs so cudnn.benchmark can select algorithms before timing
    for i in range(10):
        y = m2(m1(x))
    torch.cuda.synchronize()
    start = time.time()
    for i in range(times):
        h = m1(x)
        y = m2(h)
    torch.cuda.synchronize()
    duration = (time.time() - start) / times
    return duration

if __name__ == '__main__':
    # Time x Batch x Channel
    x = torch.randn(50, 80, 512).cuda()
    print(f'{linear.__name__}: {linear(x):.6f}s')
    print(f'{conv1d.__name__}: {conv1d(x):.6f}s')
```

Output is:

```
linear: 0.001002s
conv1d: 0.001185s
```

I am using:

- Hardware: NVIDIA GTX 1080 Ti
- Library: the latest PyTorch built from source, with CUDA 8.0 and cuDNN 6.0

Could you add a synchronization before starting the timer?
Note that even though the operations should be similar, cuDNN etc. might choose specific algorithms which can be faster or slower than their counterparts.
Also, try setting `torch.backends.cudnn.benchmark = True` and let the operations run a few times before timing them.

Thanks! I have already rerun the experiment following your advice (see the updated code and results above), but the result still suggests that `Conv1d` is about 20% slower than `Linear`.
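As a side note for anyone timing this on a newer PyTorch than the one in this thread: `torch.utils.benchmark.Timer` handles warm-up and CUDA synchronization automatically, which removes the manual `time.time()`/`synchronize()` bookkeeping. A sketch (falls back to CPU if no GPU is available):

```python
import torch
import torch.utils.benchmark as benchmark

device = 'cuda' if torch.cuda.is_available() else 'cpu'
linear = torch.nn.Linear(512, 1024).to(device)
conv = torch.nn.Conv1d(512, 1024, kernel_size=1).to(device)

x = torch.randn(50, 80, 512, device=device)
xc = x.transpose(1, 2).contiguous()  # channels-first layout for Conv1d

# Timer synchronizes CUDA and runs warm-up iterations internally
t_linear = benchmark.Timer(stmt='linear(x)', globals={'linear': linear, 'x': x})
t_conv = benchmark.Timer(stmt='conv(xc)', globals={'conv': conv, 'xc': xc})

print(t_linear.timeit(100))
print(t_conv.timeit(100))
```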