1D Convolution Data Shaping

I know it might be intuitive to others, but I have huge confusion and frustration when it comes to shaping data for convolution, either 1D or 2D: the documentation makes it look simple, yet it always gives errors because of kernel size or input shape. I have been trying to understand the data shaping from the link [1]. Basically, I am attempting to use Conv1d in RL; the Conv1d should accept data from 12 sensors over 25 timesteps.
The data shape is (25, 12)

I am attempting to use the model below:

class DQN_Conv1d(nn.Module):
    def __init__(self, input_shape, n_actions):
        super(DQN_Conv1d, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(input_shape[0], 32, kernel_size=4, stride=4),
            nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=4, stride=2),
            nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=3, stride=1),
            nn.ReLU(),
            nn.Linear(64, 512),
            nn.ReLU(),
            nn.Linear(512, n_actions)
        )

    def forward(self, x):
        return self.conv(x)

but I get the error:
RuntimeError: Calculated padded input size per channel: (1 x 3). Kernel size: (1 x 4). Kernel size can't be greater than actual input size at c:\a\w\1\s\windows\pytorch\aten\src\thnn\generic/SpatialConvolutionMM.c:50

How should I properly shape the data of 12 sensors and 25 data points for a 1D convolution in PyTorch?

Thanks in advance

[1] https://blog.goodaudience.com/introduction-to-1d-convolutional-neural-networks-in-keras-for-time-sequences-3a7ff801a2cf

Hi,

You can check the shape of the input that causes the error (read the error message carefully: the kernel size is 4 but the input size is 3, so the input is smaller than the kernel). You can see it simply with print(input_hoge_hoge) on whatever tensor you feed in. In other words, you either need more elements in the input or a smaller kernel size.
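
For instance, a minimal sketch (using only the shapes from the question, not the actual training code) that prints the shape right before and after the first convolution already reveals where the 3 comes from:

import torch
import torch.nn as nn

# Hypothetical standalone check: same first layer as in the question.
conv1 = nn.Conv1d(in_channels=25, out_channels=32, kernel_size=4, stride=4)
x = torch.randn(1, 25, 12)   # what print(input_hoge_hoge.shape) would show
print(x.shape)               # torch.Size([1, 25, 12])
print(conv1(x).shape)        # torch.Size([1, 32, 3]) -> only 3 elements left for the next kernel of size 4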

Also, when posting code you can put the word "python" right after the opening triple backticks; wrapping your code in those backtick fences keeps the indentation and makes the Python code readable.


Hi,
I added the print statements below to the main logic:

    print(env.observation_space.shape)
    print(env.action_space.n)
    net = dqn_model.DQN_Conv1d(env.observation_space.shape, env.action_space.n).to(device)
    print(net)

And got the following in the terminal:

(25, 12)
5
DQN_Conv1d(
  (conv): Sequential(
    (0): Conv1d(25, 32, kernel_size=(4,), stride=(4,))
    (1): ReLU()
    (2): Conv1d(32, 64, kernel_size=(4,), stride=(2,))
    (3): ReLU()
    (4): Conv1d(64, 64, kernel_size=(3,), stride=(1,))
    (5): ReLU()
    (6): Linear(in_features=64, out_features=512, bias=True)
    (7): ReLU()
    (8): Linear(in_features=512, out_features=5, bias=True)
  )
)

before I get the error:

RuntimeError: Calculated padded input size per channel: (1 x 3). Kernel size: (1 x 4). Kernel size can't be greater than actual input size at c:\a\w\1\s\windows\pytorch\aten\src\thnn\generic/SpatialConvolutionMM.c:50

Assuming your input shape is [N, 25, 12], after the first Conv1d it will become [N, 32, 3], and a length of 3 is too short for the next Conv1d with kernel_size=(4,).

Check https://pytorch.org/docs/stable/nn.html#torch.nn.Conv1d for the output shape formula of Conv1d, and try padding or changing the stride. But I think a length of 12 is just not enough.
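
As a rough sketch of that suggestion (keeping the 12-element dimension as the length, which may or may not match your final design), padding and a smaller stride keep the intermediate length from collapsing:

import torch
import torch.nn as nn

x = torch.randn(1, 25, 12)

# Original first layer: length shrinks to (12 - 4)//4 + 1 = 3, too short for the next kernel of size 4.
# With stride=1 and padding=2 the length stays larger: (12 + 2*2 - 4)//1 + 1 = 13.
conv1 = nn.Conv1d(25, 32, kernel_size=4, stride=1, padding=2)
print(conv1(x).shape)   # torch.Size([1, 32, 13])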

Hi @Ramzy_Karam-san,

The third print shows the inside of the model, where you can see the kernel size of each convolution. Which ones have kernel size 4? Yes, the first and second convolutions. I do not know whether observation_space is the input to your model, but let us assume so; then you can calculate the output size as follows:

"output size" = ("input size" - "kernel size")/"stride factor" + 1

So the first output size is (12 - 4)/4 + 1 = 3; thus at the second convolution your model has a 3-element input, but your kernel size is 4. That matches the error message, right? So your choices are: adjust the input size for the first convolution, or adjust the first kernel size and/or the stride factor.
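
A small sketch of that calculation (not part of the original code; the padding term is the general form from the Conv1d docs) just to confirm the formula against PyTorch:

import torch
import torch.nn as nn

def conv1d_out_len(in_len, kernel_size, stride, padding=0):
    # "output size" = ("input size" + 2*padding - "kernel size") // "stride factor" + 1
    return (in_len + 2 * padding - kernel_size) // stride + 1

print(conv1d_out_len(12, kernel_size=4, stride=4))                               # 3
print(nn.Conv1d(25, 32, kernel_size=4, stride=4)(torch.randn(1, 25, 12)).shape)  # torch.Size([1, 32, 3])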

Thanks a lot, but there are some points that are still confusing to me and that I hope you can clarify:

  • Should "in_channels" be an int representing the number of signals, i.e. 12 in my case?
  • Should the shape of the input be (1, number of data points, number of signals), i.e. (1, 25, 12) in my case, with the 1 representing the batch number?

Firstly, the first convolution takes input_shape[0] as its in_channels, which is 25 in your code. I do not know how you set the input for your model; that really depends on your design. As you mentioned in your first post, 25 is the number of time steps you want. If you set a batch size somewhere in your code, then the input may have been batched.

I do not see where "in_channels" appears in your code. However, in general, the channel dimension is the number of input signals (planes) that the convolution operates over.

I recommend that you check your model's input first. Did you set a batch size or not? If you did, change its value from 25 to some other number such as 3, then print the input shape again. If the shape now contains 3, the original 25 was the batch size; if it does not, the 25 is probably something else.
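
For example, a minimal sketch of such a check, assuming a single observation of shape (25, 12) coming straight from the environment (the variable names here are hypothetical):

import torch

obs = torch.randn(25, 12)   # hypothetical single observation: 25 timesteps x 12 sensors
x = obs.unsqueeze(0)        # add a leading batch dimension
print(x.shape)              # torch.Size([1, 25, 12]) -> the leading 1 is the batch size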

Thanks a lot for your support; it seems like I messed up the whole thing.
First, I didn't set a batch size anywhere.
Second, I took the model from a book I was following. I have now adjusted the strides to 1, which gets past that error but gives me the error below in the Linear layer:

RuntimeError: size mismatch, m1: [64 x 4], m2: [64 x 512] at c:\a\w\1\s\windows\pytorch\aten\src\th\generic/THTensorMath.cpp:940

The 12 represents the number of features, and I have more features to add. The 25 represents the number of time steps in a time series, and I can add more of those as well.
For the model, I just used one from a reinforcement learning book I was following.

So I am still confused: for a batch size of 1, should the dimension after the batch be the number of features or the timesteps? As far as I understand from the blog I linked earlier, it is the timesteps.
Also, where should I set the batch size? In the shape of the input?

Ok,

  • First, check which tensor is m1 and which is m2; just identify the objects.
  • Second, check how the matrix operation works.
  • Third, check the shapes of the tensors that are the operands (m1 and m2) of the fully-connected layer.

Through these checks you can find out where your misunderstanding is. I do not recommend skipping the three-step check; skipping it creates more confusion. Step by step is the shortest path.
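
As a sketch of the third check (assuming the strides are now all 1, as you mentioned; the explicit flatten is only there to show where the numbers come from, not necessarily how your book handles it):

import torch
import torch.nn as nn

conv = nn.Sequential(
    nn.Conv1d(25, 32, kernel_size=4, stride=1), nn.ReLU(),
    nn.Conv1d(32, 64, kernel_size=4, stride=1), nn.ReLU(),
    nn.Conv1d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
)
x = torch.randn(1, 25, 12)
feat = conv(x)
print(feat.shape)                   # torch.Size([1, 64, 4]) -> this is where "m1: [64 x 4]" comes from
flat = feat.view(feat.size(0), -1)  # flatten to (batch, 64 * 4) before any nn.Linear
print(flat.shape)                   # torch.Size([1, 256]) -> so the first Linear would need in_features=256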

Thanks a lot for your help. I was able to do the first step but got confused by the other two.
Eventually, following this URL [1], I rewrote the model, though the model is still somewhat arbitrary.

But is this the right way to use a 1D convolution for 12 channels (sensors) and 25 data points (time steps)?

The forward function accepts the following shape:
torch.Size([1, 25, 12])

The model:

class DQN_Conv1d(nn.Module):
    def __init__(self, input_shape, n_actions):
        super(DQN_Conv1d, self).__init__()
        self.conv1 = nn.Conv1d(input_shape[0], 32, kernel_size=4, stride=1)
        self.conv2 = nn.Conv1d(32, 64, kernel_size=4, stride=1)
        self.conv3 = nn.Conv1d(64, 64, kernel_size=3, stride=1)
        self.conv_drop = nn.Dropout(0.2)
        self.fc1 = nn.Linear(256, 512)
        self.fc2 = nn.Linear(512, n_actions)

    def forward(self, x):
        print(x)
        print(x.shape)
        x = F.relu(F.max_pool1d(self.conv1(x), 1))
        x = F.relu(F.max_pool1d(self.conv2(x), 1))
        x = F.relu(F.max_pool1d(self.conv_drop(self.conv3(x)), 1))
        x = x.view(-1, 256)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = F.relu(self.fc2(x))
        return F.log_softmax(x)

[1] Inferring shape via flatten operator
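
For what it is worth, the hard-coded 256 in fc1 is consistent with the (1, 25, 12) input shown above; a quick check (assuming the DQN_Conv1d class above is in scope):

import torch

# conv1: (12 - 4) + 1 = 9, conv2: (9 - 4) + 1 = 6, conv3: (6 - 3) + 1 = 4
# flattened: 64 channels * 4 = 256, which matches x.view(-1, 256)
net = DQN_Conv1d((25, 12), 5)
print(net(torch.randn(1, 25, 12)).shape)   # torch.Size([1, 5])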

Hi, @Ramzy_Karam-san,

Are you still stuck on the issue of:

RuntimeError: size mismatch, m1: [64 x 4], m2: [64 x 512] at c:\a\w\1\s\windows\pytorch\aten\src\th\generic/THTensorMath.cpp:940

If so, then you need to understand the matrix operation.
The matrix-matrix multiplication used in a fully-connected layer can be written as follows:

for i in range(I):
    for j in range(J):
        c[i][j] = 0.
        for k in range(K):
            c[i][j] += a[i][k] * b[k][j]

You can see the relationship between the indices of the matrices: the second dimension of a must match the first dimension of b. Now compare the shapes of m1 and m2 in the error message.
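
To make that concrete (the shapes are copied from the error message; the tensors themselves are just random placeholders):

import torch

m1 = torch.randn(64, 4)     # the conv output as the Linear layer sees it: 64 rows, 4 features each
m2 = torch.randn(64, 512)   # the Linear(64, 512) weight viewed as a (64 x 512) matrix

# m1 @ m2 needs m1's second dimension (4) to equal m2's first dimension (64);
# 4 != 64 is exactly the "size mismatch" the error reports.
print(m1.shape[1] == m2.shape[0])   # False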

Oh, thanks, I got past all those issues using the model I wrote in the previous comment.
Now I am adjusting my algorithm for the reinforcement learning part, but I wanted to validate with you that I am doing something that makes sense. Is this the right way to use a 1D convolution for 12 channels (sensors) and 25 data points (time steps)?
I am not asking about data cleaning, hyperparameters, or model complexity; I am asking whether the right shape of data to feed to the forward function for this 1D convolution problem is (batches, data points, channels) or (batches, channels, data points).

You can check the positions of those three values by shuffling their order in the argument.
If an error occurs, that order is incorrect, right? (The three values should all be different so the problem is unambiguous.)

Shuffling them still runs without an error, but I want to be sure the data is aligned correctly.
If the convolution works by running a kernel over the timesteps to "summarize" them as feature extraction, am I giving it the data the way it should be received?

No, what I mean is to compare and check the order of

batches, data points, channels

and

batches, channels, data points

To find out which is correct, check which one raises an error; the order with no error is the one you want.
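
For reference, the Conv1d documentation linked earlier specifies the input as (N, C_in, L), i.e. (batches, channels, data points). A minimal sketch for 12 sensors and 25 timesteps (assuming the data is stored as (batch, timesteps, sensors) and needs to be permuted first):

import torch
import torch.nn as nn

stored = torch.randn(1, 25, 12)   # (batch, timesteps, sensors) as collected
x = stored.permute(0, 2, 1)       # -> (batch, sensors, timesteps) = (1, 12, 25)

conv = nn.Conv1d(in_channels=12, out_channels=32, kernel_size=4, stride=1)
print(conv(x).shape)              # torch.Size([1, 32, 22]); the kernel slides over the 25 timesteps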