Does passing 1-dimensional data through Conv2d with kernel size (n, 1) give the same result as Conv1d?
For the sake of illustration, say we have an input of shape (1024, 9, 128) and a Conv1d layer with a kernel size of 2. Instead of passing this through a Conv1d, can I pass it through a Conv2d with an input of shape (1024, 9, 128, 1) and a kernel size of (2, 1)? Will it give the same result in both cases? If so, what exactly is the purpose of Conv1d?
Yes, it should give the same results.
Here is a small script to reproduce this:
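import torch
import torch.nn as nn

x = torch.randn(1024, 9, 128)
conv1d = nn.Conv1d(9, 1, 2)
conv2d = nn.Conv2d(9, 1, (2, 1))
# Copy the Conv1d parameters into the Conv2d layer so both layers compute the same function
with torch.no_grad():
    conv2d.weight.copy_(conv1d.weight.unsqueeze(3))
    conv2d.bias.copy_(conv1d.bias)
output1d = conv1d(x)
output2d = conv2d(x.unsqueeze(3))
# Exact comparison; torch.allclose is the safer check if a backend introduces small floating-point differences
print((output1d == output2d.squeeze(3)).all())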
I would guess it’s convenient to use nn.Conv1d if your data has only one temporal dimension, instead of having to specify two dimensions for the kernel_size, padding, and stride explicitly.
Also, I’m not sure about the underlying implementations (especially how cuDNN handles 1D data), which might result in speed differences.
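To make the kernel_size, padding, and stride point concrete, here is a minimal sketch of the same comparison with a non-default stride and padding. The tensor sizes and layer arguments are chosen only for illustration and are not from the original question.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

x = torch.randn(4, 9, 128)  # (batch, channels, length); illustrative sizes

# Conv1d takes a single int for each of kernel_size, stride, and padding
conv1d = nn.Conv1d(9, 1, kernel_size=3, stride=2, padding=1)
# The Conv2d equivalent has to spell out the trailing dummy dimension every time
conv2d = nn.Conv2d(9, 1, kernel_size=(3, 1), stride=(2, 1), padding=(1, 0))

# Share the parameters so both layers compute the same function
with torch.no_grad():
    conv2d.weight.copy_(conv1d.weight.unsqueeze(3))
    conv2d.bias.copy_(conv1d.bias)

out1d = conv1d(x)
out2d = conv2d(x.unsqueeze(3)).squeeze(3)

print(out1d.shape, out2d.shape)
print(torch.allclose(out1d, out2d, atol=1e-6))
```

Note that the padding along the dummy dimension has to be 0 and the stride 1; otherwise the output shape would no longer match the Conv1d result.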
@ptrblck How would you convert this model into Conv2D layers?
import torch
import torch.nn as nn
import torch.nn.functional as F

class Conv1DModel(nn.Module):
    def __init__(self, n_input=1, n_output=35, stride=16, n_channel=32):
        super(Conv1DModel, self).__init__()
        self.conv1 = nn.Conv1d(n_input, n_channel, kernel_size=80, stride=stride)
        self.conv2 = nn.Conv1d(n_channel, n_channel, kernel_size=3)
        # self.dropout1 = nn.Dropout(0.25)
        # self.dropout2 = nn.Dropout(0.5)
        self.fc1 = nn.Linear(3952, n_channel)
        self.fc2 = nn.Linear(n_channel, n_output)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        # x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        # x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output
I have tried this so far. What do you think?
class Conv2DModel(nn.Module):
    def __init__(self, n_input=1, n_output=35, stride=16, n_channel=32):
        super(Conv2DModel, self).__init__()
        self.conv1 = nn.Conv2d(n_input, n_channel, kernel_size=(80,1), stride=(16,16))
        self.conv2 = nn.Conv2d(n_channel, n_channel, kernel_size=(1,3))
        # self.dropout1 = nn.Dropout(0.25)
        # self.dropout2 = nn.Dropout(0.5)
        self.fc1 = nn.Linear(1976, n_channel)
        self.fc2 = nn.Linear(n_channel, n_output)

    def forward(self, x):
        x = self.conv1(x)
        print(x.shape)
        x = F.relu(x)
        print(x.shape)
        x = self.conv2(x)
        print(x.shape)
        x = F.relu(x)
        print(x.shape)
        x = F.max_pool2d(x, 2)
        # x = self.dropout1(x)
        x = torch.flatten(x, 1)
        print(x.shape)
        x = self.fc1(x)
        x = F.relu(x)
        # x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output
I don’t know what your input shape is, but assuming that dim3 has the actual sequence length and dim4 is set to 1, you should specify the kernel_size, stride, etc. as (size, 1).
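As a rough sketch of the (size, 1) rule applied to the two convolution layers of the posted Conv1DModel: the helper name conv1d_to_conv2d is hypothetical, it assumes integer (not string) padding, and the input shape (2, 1, 8000) is an assumption about the data layout. The pooling and linear layers are left out on purpose, since they need their own shape check.

```python
import torch
import torch.nn as nn


def conv1d_to_conv2d(conv1d: nn.Conv1d) -> nn.Conv2d:
    """Hypothetical helper: build a Conv2d with (size, 1) arguments and copied weights.

    Assumes the Conv1d was created with integer padding (not 'same' or 'valid').
    """
    conv2d = nn.Conv2d(
        conv1d.in_channels,
        conv1d.out_channels,
        kernel_size=(conv1d.kernel_size[0], 1),
        stride=(conv1d.stride[0], 1),
        padding=(conv1d.padding[0], 0),
        dilation=(conv1d.dilation[0], 1),
        groups=conv1d.groups,
        bias=conv1d.bias is not None,
    )
    with torch.no_grad():
        # Conv1d weight is (out, in, k); Conv2d expects (out, in, k, 1)
        conv2d.weight.copy_(conv1d.weight.unsqueeze(3))
        if conv1d.bias is not None:
            conv2d.bias.copy_(conv1d.bias)
    return conv2d


torch.manual_seed(0)

# The two convolution layers from the posted Conv1DModel
conv1 = nn.Conv1d(1, 32, kernel_size=80, stride=16)
conv2 = nn.Conv1d(32, 32, kernel_size=3)
conv1_2d = conv1d_to_conv2d(conv1)
conv2_2d = conv1d_to_conv2d(conv2)

x = torch.randn(2, 1, 8000)  # assumed layout: (batch, channels, length)
with torch.no_grad():
    out1d = conv2(torch.relu(conv1(x)))
    # The sequence length stays in dim2; dim3 is the dummy dimension of size 1
    out2d = conv2_2d(torch.relu(conv1_2d(x.unsqueeze(3)))).squeeze(3)

print(out1d.shape)
print(torch.allclose(out1d, out2d, atol=1e-5))
```

The attempt above mixes (80, 1) with (1, 3) and strides along both dimensions, which is why its shapes diverge from the 1D model; keeping the length in the same dimension for every layer avoids that.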
This is the input shape: torch.Size([1, 32, 8000])
I don’t think this would work, as it would crash with the same error described in your other post. nn.Conv2d expects a 4-dimensional tensor, while you are posting the shape of a 3-dimensional tensor.
Make sure the code is executable by creating the model object and a random tensor with a specified shape.
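For reference, an executable snippet of that kind could look like the sketch below. It assumes the posted shape [1, 32, 8000] means (batch, channels, length), and the layer is purely illustrative rather than taken from the model above.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 32, 8000)  # the posted shape, assumed to be (batch, channels, length)
x4d = x.unsqueeze(3)          # add a trailing dummy dimension for Conv2d

# Illustrative layer: in_channels has to match dim1 of the input (32 here)
conv = nn.Conv2d(32, 32, kernel_size=(3, 1))

out = conv(x4d)
print(x4d.shape)  # torch.Size([1, 32, 8000, 1])
print(out.shape)  # torch.Size([1, 32, 7998, 1])
```

Posting the printed shapes together with the model definition makes it much easier to see where a mismatch comes from.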