Size mismatch error even after flattening in Conv1D

Hello, I am trying to build a convolution-based classifier for some time series data. While running the code I get the following error -
RuntimeError: size mismatch, m1: [1 x 20592], m2: [20526 x 20658]
The model class is as follows -

class timeSeriesConv(nn.Module):
    def __init__(self, channels, seq_length, kernel_size=3, K=2):
        super(timeSeriesConv, self).__init__()
        self.channels = channels
        self.seq_length = seq_length
        self.kernel = kernel_size
        self.Conv1D = nn.Conv1d(in_channels=self.channels,
                                out_channels=self.channels,
                                kernel_size=self.kernel,
                                stride=1)
        self.criterion = nn.CrossEntropyLoss().cuda()


        self.depthwiseConv = nn.Conv1d(in_channels=self.channels,
                                       out_channels=K * self.channels,
                                       kernel_size=self.kernel,
                                       stride=1)
        self.fc1 = nn.Linear(in_features=K*self.channels * (self.seq_length - 2*(self.kernel-1)), out_features=(K*self.channels*self.seq_length))
        self.fc2 = nn.Linear(in_features=(K*self.channels*self.seq_length), out_features=4)

    def forward(self, X):
        out = nn.functional.elu(X)
        out = self.depthwiseConv(out)
        out = out.view(out.size(0), -1)
        out = nn.functional.elu(out)
        out = self.fc1(out)
        out = nn.functional.elu(out)
        return (self.fc2(out))

where channels = 22, seq_length = 313, K = 3 and kernel_size = 2.
The input shape is [1, 22, 313].
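(For reference, plugging these values in: fc1 is created with in_features = 3*22*(313 - 2*(2-1)) = 20526 and out_features = 3*22*313 = 20658, which is exactly the m2 shape in the error message, while m1 is the flattened activation actually reaching fc1.)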
Error is being thrown at the line

out = self.fc1(out)

Please tell me where I am going wrong.
TIA

Currently you are calculating the sequence length of the activation after self.depthwiseConv as self.seq_length - 2*(self.kernel-1), which is wrong, as the kernel (with stride=1 and no padding) will only remove one signal value on each side.

in_features=K*self.channels * (self.seq_length - 2*(self.kernel//2))

should work.
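For reference, with stride=1 and no padding a Conv1d shortens the sequence by kernel_size - 1, so the flattened size after self.depthwiseConv is K * channels * (seq_length - (kernel_size - 1)). A minimal, standalone check (assuming kernel_size=3, channels=22, K=3):

import torch
import torch.nn as nn

conv = nn.Conv1d(in_channels=22, out_channels=3 * 22, kernel_size=3, stride=1)
x = torch.randn(1, 22, 313)
out = conv(x)
print(out.shape)                        # torch.Size([1, 66, 311]), since 313 - (3 - 1) = 311
print(out.view(out.size(0), -1).shape)  # torch.Size([1, 20526]), matching fc1's in_features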

Thanks for replying @ptrblck, I am getting the same error with the same mismatch sizes after trying out the solution.

This shouldn’t be the case, as you’ve changed the number of input features.
Is the error message exactly the same, and are you sure you've updated the code (make sure to rerun the cell in case you are using a Jupyter notebook)?

I did double-check and the error message is indeed the same with the same tensor sizes and thrown at the same line. The code is updated. Here is the error thrown

size mismatch, m1: [1 x 20592], m2: [20526 x 20658] at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:290

@ptrblck could you please explain where the matrix m1 [1 x 20592] is coming from? Because the output after the convolutions should be 3*22*311 = [1, 20526] after flattening.
I even tried it out on dummy data with the same shape and the output tensor is as expected

The calculation is correct and my suggestion should fix this error:

class timeSeriesConv(nn.Module):
    def __init__(self, channels, seq_length, kernel_size=3, K=2):
        super(timeSeriesConv, self).__init__()
        self.channels = channels
        self.seq_length = seq_length
        self.kernel = kernel_size

        self.depthwiseConv = nn.Conv1d(in_channels=self.channels,
                                       out_channels=K * self.channels,
                                       kernel_size=self.kernel,
                                       stride=1)
        self.fc1 = nn.Linear(in_features=K*self.channels * (self.seq_length - 2*(self.kernel//2)), out_features=(K*self.channels*self.seq_length))
        self.fc2 = nn.Linear(in_features=(K*self.channels*self.seq_length), out_features=4)

    def forward(self, X):
        out = nn.functional.elu(X)
        out = self.depthwiseConv(out)
        out = out.view(out.size(0), -1)
        out = nn.functional.elu(out)
        out = self.fc1(out)
        out = nn.functional.elu(out)
        return (self.fc2(out))


x = torch.randn(1, 22, 313)
model = timeSeriesConv(22, 313, K = 3)
output = model(x)
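(As a quick sanity check on this example, output.shape should come out as torch.Size([1, 4]), since fc2 maps to the 4 classes.)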

Thank you @ptrblck that worked perfectly.
On an unrelated topic, the cross entropy loss is coming out too high; it oscillates between 15 and 25. The training loop and dataloader are given below. Could you please help me out with it?

    def train_model(self, data, model, epochs):
        cudnn.benchmark = True
        model.train()
        model.cuda()
        optimizer = torch.optim.Adam(model.parameters())
        loss_history = []
        min_loss = 100  # a high initial value so that the first epoch's loss is always lower
        for epoch in range(0, epochs+1):
            avg_loss = 0
            st = time.time()
            for _,(x, y) in enumerate(data):
                optimizer.zero_grad()
                x = x.reshape(1, 22, 313)
                x = x.float().cuda()
                y = y.long().cuda()
                out = model(x)
                print(out)
                loss = self.criterion(out, y)
                loss.backward()
                optimizer.step()
                avg_loss += loss.item()

            et = time.time()
            print('--------------------------------------------------------------')
            print('TIME')
            print(et-st)
            print('LOSS')
            print(avg_loss/len(data))
            loss_history.append(avg_loss/len(data))
            if (avg_loss/len(data) < min_loss):
                torch.save(model.state_dict(), 'Conv.pth')
                min_loss = avg_loss/len(data)
        plt.scatter(numpy.arange(len(loss_history)), loss_history)
        plt.show()
        print('--------------   DONE TRAINING  -----------------')

The dataloader:

def dataloader_train(path):
    os.chdir(path)
    train_data = []
    Data = []
    Labels = []
    classes = {'769':int(0), '770':int(1), '771':int(2), '772':int(3)}
    folders = ['769', '770', '771', '772']
    for folder in folders:
        files = os.listdir(path + '/' + folder)
        os.chdir(path + '/' + folder)
        for file in files:
            data = numpy.load(file)
            data = numpy.transpose(data)
            #Data.append(data)
            #Labels.append((int(folder)))
            train_data.append([data, torch.tensor(data=(classes[folder]), dtype=torch.int64)])

    train_loader = torch.utils.data.DataLoader(train_data, batch_size=1, shuffle=True)
    return train_loader, train_data

The lowest loss this model could reach was around 15, even though it has a lot of trainable parameters (way more than a million). Any idea why?
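(For context: with 4 balanced classes, a randomly initialized classifier should start near -ln(1/4) ≈ 1.39, so a loss in the 15-25 range suggests the logits themselves are very large.)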

Try to overfit a small data sample (e.g. just 10 samples) to check that the overall training routine doesn't have any bugs.
I'm not completely sure how your data loading pipeline works, so this would also act as a check for that part of the code.
Once your model overfits the small sample, try to scale it up.
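A rough sketch of what this check could look like, reusing the names from the posted code (timeSeriesConv, train_data) and the same reshape as in train_model:

import torch
import torch.nn as nn

small_loader = torch.utils.data.DataLoader(train_data[:10], batch_size=1, shuffle=True)
model = timeSeriesConv(22, 313, K=3).cuda()
model.train()
optimizer = torch.optim.Adam(model.parameters())
criterion = nn.CrossEntropyLoss()

for epoch in range(100):
    for x, y in small_loader:
        optimizer.zero_grad()
        out = model(x.float().reshape(1, 22, 313).cuda())
        loss = criterion(out, y.long().cuda())
        loss.backward()
        optimizer.step()

print(loss.item())  # should approach zero if the training routine and data pipeline are correct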

Sure, I will try it out.
Thanks a lot

@ptrblck Sorry for bothering so much. I changed my model to the following -

class timeSeriesConv(nn.Module):
    def __init__(self, channels, seq_length, kernel_size=3, k=2):
        super(timeSeriesConv, self).__init__()
        self.channels = channels
        self.seq_length = seq_length
        self.kernel = kernel_size
        self.criterion = nn.CrossEntropyLoss().cuda()
        self.conv1 = nn.Conv1d(in_channels=self.channels, out_channels=self.channels, kernel_size=self.kernel, stride=1)
        self.depthwiseConv = nn.Conv1d(in_channels=self.channels,
                                       out_channels=k * self.channels,
                                       kernel_size=self.kernel,
                                       stride=1)

        self.fc1 = nn.Linear(in_features=k * self.channels * 309,
                             out_features=(k * self.channels * self.seq_length))

        self.fc2 = nn.Linear(in_features=(k * self.channels * self.seq_length),
                             out_features=2 * k * self.channels * self.seq_length)

        self.fc3 = nn.Linear(in_features=2 * k * self.channels * self.seq_length,
                             out_features=2 * k * self.channels * self.seq_length)

        self.fc4 = nn.Linear(in_features=2 * k * self.channels * self.seq_length,
                             out_features=(k * self.channels * self.seq_length) // 2)

        self.fc5 = nn.Linear(in_features=(k * self.channels * self.seq_length) // 2,
                             out_features=(k * self.channels * self.seq_length) // 4)
        self.fc6 = nn.Linear(in_features=(k * self.channels * self.seq_length) // 4, out_features=4)

    def forward(self, X):
        out = self.conv1(X)
        out = nn.functional.relu(out)
        out = self.depthwiseConv(out)
        out = out.view(out.size(0), -1)
        out = self.fc1(out)
        out = nn.functional.relu(out)
        nn.Dropout(0.5)
        out = self.fc2(out)
        out = nn.functional.relu(out)
        nn.Dropout(0.3)
        out = self.fc3(out)
        out = nn.functional.relu(out)
        out = self.fc4(out)
        out = nn.functional.relu(out)
        nn.Dropout(0.25)
        out = self.fc5(out)
        out = nn.functional.relu(out)
        return self.fc6(out)

whereas the training function is -

    def train_model(self, data, model, epochs):
        cudnn.benchmark = True
        model.train()
        model.cuda()
        optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
        loss = []
        min_loss = 100
        for epoch in range(0, epochs + 1):
            avg_loss = 0
            st = time.time()
            for _, (x, y) in enumerate(data):
                optimizer.zero_grad()
                x = x.reshape(1, 22, 313)
                x = x.float().cuda()
                y = y.long().cuda()
                out = model(x)
                print(out)
                loss = self.criterion(out, y)
                loss.backward()
                optimizer.step()
                avg_loss += loss.item()

As you can see, everything has .cuda() following it, so it should be loaded onto GPU memory. But now, after changing the architecture, system RAM gets filled up all the way to swap and beyond. Since the dataset has not changed, it cannot be the issue, as I ran the model on it before.
Could you please help?

How much RAM is used by this model and training loop?

Also, the nn.Dropout layers in the forward pass won’t be used, as you are not calling the module with the activation.
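To actually apply dropout you could either register the modules in __init__ and call them on the activation, or use the functional API; a sketch (not from the original reply):

# in __init__:
self.drop1 = nn.Dropout(p=0.5)

# in forward:
out = self.drop1(out)

# or functionally, passing the training flag explicitly:
out = nn.functional.dropout(out, p=0.5, training=self.training)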

As soon as I hit run, after printing the output for one sample in the first epoch, RAM gets filled up by the optimizer.step() line.
I have 16 GB of RAM + 2 GB of swap and both get filled up. The GPU is an RTX 2070.
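For scale, a back-of-the-envelope count of the largest layer in the posted model (with k=3, channels=22, seq_length=313) suggests where the memory goes:

hidden = 2 * 3 * 22 * 313       # 41316 features in and out of fc3
fc3_weights = hidden * hidden   # ~1.7 billion parameters in fc3 alone
print(fc3_weights * 4 / 1e9)    # ~6.8 GB in float32, before gradients and optimizer buffers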

The posted code creates a shape mismatch again using:

model = timeSeriesConv(22, 313, kernel_size=2, k=3).cuda()
x = torch.randn(1, 22, 313).cuda()
output = model(x)
> RuntimeError: size mismatch, m1: [1 x 20526], m2: [20394 x 20658]

There is some silly mistake I am making when calculating the in_features of fc1 which I am not able to catch. The above code only works for odd kernel sizes, so for now I just went ahead with it.
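For reference, the in_features of fc1 can be derived instead of hard-coded by using the output-length formula from the nn.Conv1d docs; a small helper (not part of the original thread) shows why the hard-coded 309 only matches odd kernel sizes:

def conv1d_out_len(l_in, kernel_size, stride=1, padding=0, dilation=1):
    # output-length formula from the nn.Conv1d documentation
    return (l_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

# two convolutions with stride=1 and no padding each remove (kernel_size - 1) steps:
print(conv1d_out_len(conv1d_out_len(313, 3), 3))  # 309 -> matches the hard-coded in_features
print(conv1d_out_len(conv1d_out_len(313, 2), 2))  # 311 -> what the model actually produces with kernel_size=2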