I know this is not new; however, after reading many explanations I am still really confused about the parameters required for Conv1D. Specifically, I have a dataset containing 154k rows, and each row is a 1D array of 289 floats. I want to train my model using batches with batch size = 50 (this is dynamic). My model also contains other layer types such as Linear layers. The source code looks like this:
self.linear1 = nn.Linear(noise_dim, sequenceLength)
self.linear2 = nn.Linear(sequenceLength, 2*sequenceLength)
self.linear3 = nn.Linear(2*sequenceLength, 2*sequenceLength)
self.linear4 = nn.Conv1d(2*sequenceLength, sequenceLength, kernel_size=sequenceLength+1, stride=1, padding=0)
I expect to use Conv1D to reduce the size of the input from 2*sequenceLength to sequenceLength. However, it raises the following error (289 is my sequenceLength at execution time):
Given groups=1, weight of size [289, 578, 290], expected input[1, 50, 578] to have 578 channels, but got 50 channels instead
It seems to take my batch size into account. If so, whenever I need to change the batch size during training, do I also need to update my model?
Moreover, if the batch size is treated together with the input, I think it should be Conv2d. Am I right?
Thank you
nn.Conv1d expects a batched 3-dimensional input in the shape [batch_size, in_channels, seq_length] or an unbatched 2-dimensional input in the shape [in_channels, seq_length]. Based on the error message you are using the second approach and the layer unsqueezes the batch dimension for you. Note that conv layers use a kernel and apply a convolution (or rather a cross-correlation) to the input. To simply reduce the dimension of the input you could still apply a linear layer. In case your activation has a temporal dimension and a kernel should be applied over it, you might want to unsqueeze the channel dimension manually. The in_channels argument should be a single int value, not a list.
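A minimal sketch of the expected shapes (assuming batch_size = 50 and sequenceLength = 289 as in the post; the layer sizes here are illustrative, not the original model):

import torch
import torch.nn as nn

batch_size, seq_len = 50, 289
x = torch.randn(batch_size, seq_len)    # activation of shape [batch_size, seq_length]
x = x.unsqueeze(1)                      # add a channel dimension -> [batch_size, 1, seq_length]

conv = nn.Conv1d(in_channels=1, out_channels=1, kernel_size=3, padding=1)
out = conv(x)                           # the kernel slides over the 289 time steps
print(out.shape)                        # torch.Size([50, 1, 289])

# Alternatively, a plain linear layer reduces the feature dimension without a kernel:
lin = nn.Linear(2 * seq_len, seq_len)
print(lin(torch.randn(batch_size, 2 * seq_len)).shape)  # torch.Size([50, 289])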
I think I got your idea. Thank you very much.
Hi @ptrblck
This is my new update:
def __init__(self, noise_dim, sequenceLength):
    super(Generator, self).__init__()
    self.linear1 = nn.Linear(noise_dim, sequenceLength)
    self.linear2 = nn.Linear(sequenceLength, 2*sequenceLength)
    self.linear3 = nn.Linear(2*sequenceLength, 2*sequenceLength)
    self.conv4 = nn.Conv1d(2*sequenceLength, sequenceLength, kernel_size=3, padding=0)
    self.lrelu = nn.LeakyReLU(0.01)
    self.tanh = nn.Tanh()
    self.sigmoid = nn.Sigmoid()

def forward(self, x):
    out = self.linear1(x)
    out = self.lrelu(out)
    out = self.linear2(out)
    out = self.tanh(out)
    out = self.linear3(out)
    out = self.lrelu(out)
    out = out.unsqueeze(2)
    out = self.conv4(out)
    out = out.squeeze()
    out = self.sigmoid(out)
    return out
I added unsqueeze(2) in the forward function and it works fine except for the kernel size.
If I increase the kernel size to any number greater than 1, I receive the error:
Calculated padded input size per channel: (1). Kernel size: (3). Kernel size can't be greater than actual input size
My sequence length is 289 (at execution time) with 1 dimension, so I wonder why the kernel size cannot be something like 3 or 4, and so on.
Because you are unsqueezing the temporal dimension and using the sequence length as the channels. Check the expected input shape described in my previous post.
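A short sketch of what happens with the shapes (assuming batch_size = 50 and sequenceLength = 289; the tensors are random placeholders, not the actual activations):

import torch
import torch.nn as nn

seq_len = 289
out = torch.randn(50, 2 * seq_len)     # activation after linear3: [50, 578]
bad = out.unsqueeze(2)                 # [50, 578, 1] -> 578 channels, temporal length of only 1
conv = nn.Conv1d(2 * seq_len, seq_len, kernel_size=3, padding=0)
# conv(bad) raises the error above: the kernel size (3) exceeds the temporal size (1)

good = out.unsqueeze(1)                # [50, 1, 578] -> 1 channel, temporal length of 578
conv2 = nn.Conv1d(in_channels=1, out_channels=1, kernel_size=3, padding=0)
print(conv2(good).shape)               # torch.Size([50, 1, 576])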
@ptrblck, I changed the padding to 1 and it works (with kernel_size = 3).
But according to your answer, if the sequence is turned into the channels, I wonder whether the kernel still works correctly or not? Or does it only return the result without applying a stride?
Thank you
Using a padded input would avoid running into functionality issues, but note that only the “center” pixel of the conv kernel will be used as seen here:
import torch
import torch.nn as nn

conv = nn.Conv1d(10, 1, kernel_size=3, padding=1, bias=False)
x = torch.ones(1, 10, 1)   # temporal length of 1, padded with zeros on both sides
out = conv(x)
print(out)
# tensor([[[0.7653]]], grad_fn=<ConvolutionBackward0>)

# only the center kernel weights see the actual input; the outer positions only see padding
print(conv.weight[:, :, 1].sum())
# tensor(0.7653, grad_fn=<SumBackward0>)