Passing 1d data through Conv2d instead of Conv1d

Does plugging 1-dimensional data into Conv2d with a kernel size of (n, 1) give the same result as Conv1d?
For the sake of illustration, say we have an input of shape (1024, 9, 128) and a Conv1d layer with a kernel size of 2. Instead of passing this through a Conv1d, can I pass it through a Conv2d with an input of shape (1024, 9, 128, 1) and a kernel size of (2, 1)? Will it give the same result in both cases? If so, what is the exact purpose of Conv1d?


Yes, it should give the same results.
Here is a small script to reproduce this:

import torch
import torch.nn as nn

x = torch.randn(1024, 9, 128)

conv1d = nn.Conv1d(9, 1, 2)
conv2d = nn.Conv2d(9, 1, (2, 1))

# copy the Conv1d parameters into the Conv2d layer so both use the same weights
with torch.no_grad():
    conv2d.weight.copy_(conv1d.weight.unsqueeze(3))
    conv2d.bias.copy_(conv1d.bias)

output1d = conv1d(x)
output2d = conv2d(x.unsqueeze(3))  # add a trailing spatial dim of size 1

print((output1d == output2d.squeeze(3)).all())
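As a side note, exact equality between floating-point outputs can be brittle if the two layers dispatch to different kernels, so torch.allclose is often the safer comparison:

print(torch.allclose(output1d, output2d.squeeze(3)))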

I would guess it’s convenient to use nn.Conv1d if your data has only one temporal dimension, instead of having to specify two dimensions for the kernel_size, padding, and stride explicitly.
Also, I’m not sure about the underlying implementations (especially how cuDNN handles 1D data), which might result in speed differences.
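If the difference matters for your use case, you could time both modules directly, e.g. with torch.utils.benchmark (a minimal sketch, reusing the layers and input from the snippet above; not a definitive comparison):

from torch.utils import benchmark

t1 = benchmark.Timer(stmt="conv1d(x)", globals={"conv1d": conv1d, "x": x})
t2 = benchmark.Timer(stmt="conv2d(x4)", globals={"conv2d": conv2d, "x4": x.unsqueeze(3)})
print(t1.timeit(100))
print(t2.timeit(100))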


@ptrblck How would you convert this model into Conv2D layers?

import torch
import torch.nn as nn
import torch.nn.functional as F

class Conv1DModel(nn.Module):
    def __init__(self, n_input=1, n_output=35, stride=16, n_channel=32):
        super(Conv1DModel, self).__init__()
        self.conv1 = nn.Conv1d(n_input, n_channel, kernel_size=80, stride=stride)
        self.conv2 = nn.Conv1d(n_channel, n_channel, kernel_size=3)
        # self.dropout1 = nn.Dropout(0.25)
        # self.dropout2 = nn.Dropout(0.5)
        self.fc1 = nn.Linear(3952, n_channel)
        self.fc2 = nn.Linear(n_channel, n_output)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        # note: on the 3-dim (N, C, L) activation, max_pool2d treats the tensor
        # as an unbatched (C, H, W) image, so the 2x2 window halves both the
        # channel dim (32 -> 16) and the length dim (for a length-8000 input:
        # 16 * 247 = 3952 features after flattening, matching fc1)
        x = F.max_pool2d(x, 2)
        # x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        # x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output

I have tried this so far; what do you think?

class Conv2DModel(nn.Module):
    def __init__(self, n_input=1, n_output=35, stride=16, n_channel=32):
        super(Conv2DModel, self).__init__()
        self.conv1 = nn.Conv2d(n_input, n_channel, kernel_size=(80,1), stride=(16,16))
        self.conv2 = nn.Conv2d(n_channel, n_channel, kernel_size=(1,3))
        # self.dropout1 = nn.Dropout(0.25)
        # self.dropout2 = nn.Dropout(0.5)
        self.fc1 = nn.Linear(1976, n_channel)
        self.fc2 = nn.Linear(n_channel, n_output)

    def forward(self, x):
        x = self.conv1(x)
        print(x.shape)
        x = F.relu(x)
        print(x.shape)
        x = self.conv2(x)
        print(x.shape)
        x = F.relu(x)
        print(x.shape)
        x = F.max_pool2d(x, 2)
        # x = self.dropout1(x)
        x = torch.flatten(x, 1)
        print(x.shape)
        x = self.fc1(x)
        x = F.relu(x)
        # x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output

I don’t know what your input shape is, but assuming that dim3 holds the actual sequence length and dim4 is set to 1, you should specify the kernel_size, stride, etc. to use (size, 1).
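For instance, the first two layers from your attempt would then look something like this (a sketch, assuming the input is reshaped from (N, C, L) to (N, C, L, 1); n_input, n_channel, and stride are the constructor arguments):

conv1 = nn.Conv2d(n_input, n_channel, kernel_size=(80, 1), stride=(stride, 1))  # kernel and stride on the sequence axis
conv2 = nn.Conv2d(n_channel, n_channel, kernel_size=(3, 1))                     # rather than (1, 3)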

This is the input shape: torch.Size([1, 32, 8000])

I don’t think this would work, as it would crash with the same error described in your other post.
nn.Conv2d expects a 4-dimensional tensor, while you are posting shapes of a 3-dimensional tensor.
Make sure the code is executable by creating the model object and a random tensor with a specified shape.
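For example, a minimal runnable check of the 1D model could look like this (assuming a single-channel input of length 8000, which matches the in_features=3952 of fc1):

model = Conv1DModel()
x = torch.randn(1, 1, 8000)  # (batch, channels, sequence length)
out = model(x)
print(out.shape)  # torch.Size([1, 35])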