If I have:

```
self.layer1 = torch.nn.Conv1d(in_channels=512, out_channels=512, kernel_size=1)
```

isn’t that equivalent to

```
self.layer1 = torch.nn.Linear(512, 512)
```

?

Yes, that should be the case:

```
import torch
import torch.nn as nn

# Setup
conv = torch.nn.Conv1d(in_channels=512, out_channels=512, kernel_size=1).double()
lin = torch.nn.Linear(512, 512).double()
# use same param values: conv weight is [out, in, 1], linear weight is [out, in]
with torch.no_grad():
    lin.weight = nn.Parameter(conv.weight.squeeze(2))
    lin.bias = nn.Parameter(conv.bias)
# forward: conv expects [batch, channels, length]
x = torch.randn(2, 512, 20).double()
out_conv = conv(x)
# permute for linear, which acts on the last dim: [batch, length, channels]
x_lin = x.permute(0, 2, 1)
out_lin = lin(x_lin)
# check forward output
print(torch.allclose(out_lin.permute(0, 2, 1), out_conv))
# > True
print((out_lin.permute(0, 2, 1) - out_conv).abs().max())
# > tensor(1.2212e-15, dtype=torch.float64, grad_fn=<MaxBackward1>)
# check backward
out_conv.mean().backward()
out_lin.mean().backward()
print(torch.allclose(conv.weight.grad.squeeze(2), lin.weight.grad))
# > True
print(torch.allclose(conv.bias.grad, lin.bias.grad))
# > True
```
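To see why the two layers agree, it may help to write the `kernel_size=1` convolution out by hand: every position along the length dimension is multiplied by the same `[out_channels, in_channels]` matrix, which is what a linear layer does on the last dimension. Here is a minimal sketch of that, assuming small made-up shapes; `pointwise_conv_as_einsum` is a hypothetical helper name, not a PyTorch function:

```python
import torch
import torch.nn.functional as F


def pointwise_conv_as_einsum(x, weight, bias):
    """Hypothetical helper: kernel_size=1 convolution written as a matmul.

    x:      [batch, in_channels, length]
    weight: [out_channels, in_channels]
    bias:   [out_channels]
    """
    # out[b, o, t] = sum_i weight[o, i] * x[b, i, t] + bias[o]
    return torch.einsum("oi,bit->bot", weight, x) + bias[None, :, None]


torch.manual_seed(0)
# Small illustrative shapes (assumptions, not the 512-channel case above)
x = torch.randn(2, 8, 5, dtype=torch.double)
weight = torch.randn(4, 8, dtype=torch.double)
bias = torch.randn(4, dtype=torch.double)

out_einsum = pointwise_conv_as_einsum(x, weight, bias)
# Conv1d stores its weight as [out, in, kernel_size], hence the unsqueeze
out_conv = F.conv1d(x, weight.unsqueeze(2), bias)
# Linear acts on the last dim, so move channels last and back again
out_lin = F.linear(x.permute(0, 2, 1), weight, bias).permute(0, 2, 1)

print(torch.allclose(out_einsum, out_conv))
print(torch.allclose(out_einsum, out_lin))
```

All three paths compute the same sum over input channels; they only differ in the memory layout they expect and in which backend kernel ends up doing the work.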

Thanks so much. So there’s literally no difference, not even in terms of computation?

There is most likely a difference in computation, in particular if you are using CUDA operations. E.g., on an NVIDIA GPU convolutions would be dispatched to cuDNN, which could internally call into cuBLAS (the same library the linear layer uses), but this isn't guaranteed.

I don’t know exactly which methods are called on the CPU.

For my code snippet the convolution would use `cudnn::cnn::implicit_convolve_dgemm`, while the linear layer would call into `volta_dgemm_128x64_tn`.
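If you want to check which ops and kernels get called on your own setup, one option is `torch.profiler`. The following is only a sketch: `profiled_op_names` is a hypothetical helper, the shapes are made up, and the names you see will depend on your hardware, dtype, and PyTorch/cuDNN versions. On a CPU-only machine it lists the ATen ops rather than GPU kernels.

```python
import torch
from torch.profiler import profile, ProfilerActivity


def profiled_op_names(module, x):
    """Hypothetical helper: run one forward pass and return recorded event names."""
    activities = [ProfilerActivity.CPU]
    device = "cpu"
    if torch.cuda.is_available():
        # Also record GPU kernels when a CUDA device is present
        activities.append(ProfilerActivity.CUDA)
        device = "cuda"
    module = module.to(device)
    x = x.to(device)
    with profile(activities=activities) as prof:
        with torch.no_grad():
            module(x)
        if device == "cuda":
            # Make sure the kernels have finished before the profiler stops
            torch.cuda.synchronize()
    return [evt.key for evt in prof.key_averages()]


# Illustrative sizes (assumptions), double precision as in the snippet above
x = torch.randn(2, 64, 20, dtype=torch.double)
conv = torch.nn.Conv1d(64, 64, kernel_size=1).double()
lin = torch.nn.Linear(64, 64).double()

conv_names = profiled_op_names(conv, x)
lin_names = profiled_op_names(lin, x.permute(0, 2, 1))

print("conv:", conv_names)
print("linear:", lin_names)
```

Comparing the two lists should show the convolution going through the convolution ops and the linear layer going through a matrix multiply path, even though the results match numerically.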