Keras and PyTorch Conv2D give different output shapes

I have the following Keras model:

import tensorflow as tf
import keras
from keras.layers import Conv2D
from keras.layers import Input
X_2D = Input(shape=(1,5000,1)) # Input is EEG signal 1*5000 with channel =1
cnn2d = Conv2D(32, (1,10),activation='relu')(X_2D) # filters=32, kernel= (1,10)
print(X_2D.shape,'->',cnn2d.shape)

The output is: (1, 5000, 1) → (None, 1, 4991, 32)

I tried to implement the same in PyTorch:

import torch
import torch.nn as nn
a = torch.randn(1,5000,1)
m = nn.Conv2d(1, 32, (10,1)) # filters=32, kernel= (10,1)
out = m(a)
print(out.size())

The output is: torch.Size([32, 4991, 1])

The channel dimension is placed differently in the Keras and PyTorch convolutional models. How can I get the same shape? Thank you very much.

Keras seems to use the channels-last memory layout while PyTorch defaults to channels-first.
If you need to create matching shapes manually you could permute the PyTorch tensor.

Thank you very much for your inputs.
Did you mean to permute the input matrix? Also, in Keras the kernel size is (1,10). If I use the same kernel size for the input (1,5000,1), it throws an error in PyTorch. That's why I changed it to (10,1) in the PyTorch CNN model. Would permuting the input matrix resolve this error as well?

Your PyTorch input is missing the batch dimension and is thus interpreted as [channels=1, height=5000, width=1].
A kernel size of [height=1, width=10] won’t work as the width of the kernel would be larger than the padded input width.

You also won’t be able to permute the input directly, since PyTorch expects inputs in channels-first layout. The recommendation to use permute on the output was purely in case you need to compare the results between TF and PyTorch elementwise.
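As a minimal sketch of the points above (the random tensor is only a stand-in for the EEG signal), the input can be given an explicit batch and channel dimension so that the Keras kernel size (1,10) carries over unchanged. The permute at the end is only needed for an elementwise comparison against the channels-last Keras output.

```python
import torch
import torch.nn as nn

# One EEG sample in the layout PyTorch expects: [batch, channels, height, width]
x = torch.randn(1, 1, 1, 5000)

# Same hyperparameters as the Keras layer: 32 filters, kernel (height=1, width=10)
conv = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=(1, 10))
out = conv(x)
print(out.shape)  # torch.Size([1, 32, 1, 4991])

# Move the channel dimension last, only to compare against the Keras output
out_channels_last = out.permute(0, 2, 3, 1)
print(out_channels_last.shape)  # torch.Size([1, 1, 4991, 32])
```

The width shrinks from 5000 to 5000 - 10 + 1 = 4991, which matches the Keras result (None, 1, 4991, 32) once the channels are moved last.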

Thank you very much. I get the following error:

RuntimeError: Given groups=1, weight of size [32, 1, 1, 10], expected input[1, 6000, 1, 301] to have 1 channels, but got 6000 channels instead
My model is:
class ConvNet1D(nn.Module):
    def __init__(self):
        super(ConvNet1D,self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=(1,10), stride=1, padding=0), # change this
            nn.ReLU(inplace=True),
        )
        self.flat = nn.Flatten()
        # initialize our softmax classifier
        self.fc2 = nn.Linear(in_features=3213504, out_features=2) #classes =2
        self.logSoftmax = nn.LogSoftmax(dim=1)

    def forward(self, input1, input2,input3,input4,input5,input6,input7,input8,input9,input10,input11,input12):
        xe1 = self.features(input1)
        xe2 = self.features(input2)
        xe3 = self.features(input3)
        xe4 = self.features(input4)
        xe5 = self.features(input5)
        xe6 = self.features(input6)
        xe7 = self.features(input7)
        xe8 = self.features(input8)
        xe9 = self.features(input9)
        xe10 = self.features(input10)
        xe11 = self.features(input11)
        xe12 = self.features(input12)
        f1=self.flat(xe1)
        f2=self.flat(xe2)
        f3=self.flat(xe3)
        f4=self.flat(xe4)
        f5=self.flat(xe5)
        f6=self.flat(xe6)
        f7=self.flat(xe7)
        f8=self.flat(xe8)
        f9=self.flat(xe9)
        f10=self.flat(xe10)
        f11=self.flat(xe11)
        f12=self.flat(xe12)

        #con = torch.cat((f1), 1) #f2,f3,f4,f5,f6,f7,f8,f9,f10,f11,f12
        combined = torch.cat((f1.view(f1.size(0), -1),
                              f2.view(f2.size(0), -1),f3.view(f1.size(0), -1),
                              f4.view(f1.size(0), -1),f5.view(f1.size(0), -1),
                              f6.view(f1.size(0), -1),f7.view(f1.size(0), -1),
                              f8.view(f1.size(0), -1),f9.view(f1.size(0), -1),
                              f10.view(f1.size(0), -1),f11.view(f1.size(0), -1),f12.view(f1.size(0), -1),), dim=1)
        f3=self.flat(combined)
        fc=self.fc2(f3)
        x=self.logSoftmax(fc)
        return x

And this is how I train:
model.train()

trainECG1=trainECG1.to(device)
trainECG2=trainECG2.to(device)
trainECG3=trainECG3.to(device)
trainECG4=trainECG4.to(device)
trainECG5=trainECG5.to(device)
trainECG6=trainECG6.to(device)
trainECG7=trainECG7.to(device)
trainECG8=trainECG8.to(device)
trainECG9=trainECG9.to(device)
trainECG10=trainECG10.to(device)
trainECG11=trainECG11.to(device)
trainECG12=trainECG12.to(device)
trainY=trainY.to(device)
valY=valY.to(device)
valECG1 = valECG1.to(device)
valECG2 = valECG2.to(device)
valECG3 = valECG3.to(device)
valECG4 = valECG4.to(device)
valECG5 = valECG5.to(device)
valECG6 = valECG6.to(device)
valECG7 = valECG7.to(device)
valECG8 = valECG8.to(device)
valECG9 = valECG9.to(device)
valECG10 = valECG10.to(device)
valECG11 = valECG11.to(device)
valECG12 = valECG12.to(device)


optimizer.zero_grad()

y_train_pred = model(trainECG1,trainECG2,trainECG3,trainECG4,trainECG5,trainECG6,trainECG7,trainECG8,
                     trainECG9,trainECG10,trainECG11,trainECG12)

train_loss = criterion(y_train_pred, trainY.unsqueeze(1))

My input sizes are 6000x1x301 each. I mean, trainECG1 = 6000x1x301, trainECG2 = 6000x1x301, etc.
No. of samples = 6000; each sample is a 1x301 time series ECG signal. My output is binary classification.

Can you please help me to resolve the error?
Thank you very much.

Your input is again missing a dimension and the 6000 samples are interpreted as the channel dimension.
For nn.Conv2d layers I would recommend to pass the inputs in the shape [batch_size, channels, height, width].
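A minimal sketch of that recommendation, assuming one of the twelve inputs is stored as a 6000x1x301 tensor (the random data and the name `train_ecg1` are hypothetical stand-ins): insert a channel dimension so the 6000 samples stay in the batch dimension, then feed the layer a mini-batch.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for one of the 12 inputs: 6000 samples of shape 1 x 301
train_ecg1 = torch.randn(6000, 1, 301)

# Insert a channel dimension: [6000, 1, 301] -> [6000, 1, 1, 301]
train_ecg1 = train_ecg1.unsqueeze(1)

conv = nn.Conv2d(1, 32, kernel_size=(1, 10))

# Pass a mini-batch of 32 samples rather than all 6000 at once
batch = train_ecg1[:32]
out = conv(batch)
print(batch.shape)  # torch.Size([32, 1, 1, 301])
print(out.shape)    # torch.Size([32, 32, 1, 292])
```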

Thank you very much for the feedback.
Can you please confirm if this is the right way to change the input dimension?
one sample: batch_size: 32, channels: 1, height: 1, width: 301
for 6000 samples: batch_size: 32, channels: 1, height: 6000, width: 301
Currently my training matrix for input1 is 6000x1x301 and for input2 is 6000x1x301, etc., for 12 inputs.
How should I change the input dimension to resolve the error?
Thank you very much

No, I don't think this is correct, since you are moving the number of samples into the height dimension and keeping the batch size static at 32, which indicates 32 samples.

Thank you very much. I am a PyTorch newbie. I have implemented the same in Keras. This is how I trained in Keras for the same inputs. Could you please suggest how to modify this to run in PyTorch?

train1 = train[:,0:1,:] # shape is 6000*1*301
train2 = train[:,1:2,:]
train3 = train[:,2:3,:]
train4 = train[:,3:4,:]
train5 = train[:,4:5,:]
train6 = train[:,5:6,:]
train7 = train[:,6:7,:]
train8 = train[:,7:8,:]
train9 = train[:,8:9,:]
train10 = train[:,9:10,:]
train11 = train[:,10:11,:]
train12 = train[:,11:12,:]
RawInput1 = Input(shape=(1,301,1))
RawInput2 = Input(shape=(1,301,1))
RawInput3 = Input(shape=(1,301,1))
RawInput4 = Input(shape=(1,301,1))
RawInput5 = Input(shape=(1,301,1))
RawInput6 = Input(shape=(1,301,1))
RawInput7 = Input(shape=(1,301,1))
RawInput8 = Input(shape=(1,301,1))
RawInput9 = Input(shape=(1,301,1))
RawInput10 = Input(shape=(1,301,1))
RawInput11 = Input(shape=(1,301,1))
RawInput12 = Input(shape=(1,301,1))

model = Model(inputs=[RawInput1, RawInput2, RawInput3, RawInput4, RawInput5, RawInput6, RawInput7, RawInput8, RawInput9, RawInput10, RawInput11, RawInput12], outputs=output)
optim = tf.keras.optimizers.Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, amsgrad=False)
        
auc = tf.keras.metrics.AUC()
model.compile(optimizer=optim, loss='binary_crossentropy', metrics=['accuracy', auc])

history = model.fit([train1, train2, train3, train4, train5, train6, train7, train8, train9, train10, train11, train12], trainY, 
                     validation_data = ([val1, val2, val3, val4, val5, val6, val7, val8, val9, val10, val11, val12], valY), epochs=10000, batch_size=32)

I would really appreciate your inputs to resolve this. Thank you for your patience.

I don’t know how exactly you are loading your dataset, but this tutorial might be a good starter.
In general you should pass inputs as [batch_size, channels, height, width] to nn.Conv2d layers while you are using [batch_size=1, channels=6000, height=1, width=301].
Based on your description it seems you are also trying to pass all samples to the model instead of batches, so you might either slice the inputs manually in the sample dimension or use a DataLoader.
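A rough sketch of the DataLoader route, under the assumption that the whole dataset fits in a single 6000x12x301 tensor with one label per sample (the random tensors and the names `signals` and `targets` are hypothetical). Each batch is then split into the twelve single-lead inputs in [batch_size, channels, height, width] layout.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Hypothetical data: 6000 samples, 12 leads, 301 time steps, binary labels
signals = torch.randn(6000, 12, 301)
targets = torch.randint(0, 2, (6000,))

loader = DataLoader(TensorDataset(signals, targets), batch_size=32, shuffle=True)

features, labels = next(iter(loader))
# Split the 12 leads into 12 inputs of shape [batch, channels=1, height=1, width=301]
leads = [features[:, i:i + 1, :].unsqueeze(1) for i in range(12)]
print(features.shape)  # torch.Size([32, 12, 301])
print(leads[0].shape)  # torch.Size([32, 1, 1, 301])
print(len(loader))     # 188
```

The list `leads` could then be unpacked into a twelve-argument forward call such as `model(*leads)`.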

Thank you for your clarification. I use a DataLoader now and here is the complete code. The dataset dimension is 50x12x301 (time series data, number of channels = 1). No. of samples = 50.

import matplotlib.pyplot as plt
import numpy as np
import torch
from torch import nn, optim
from torchvision import datasets, transforms
import torch.nn.functional as F
import scipy.io as sio
from random import seed
from torch.utils.data import Dataset, DataLoader

foldData = r'C:\Users\Geerthy\CovidData_Codes\data_sig.mat'
data = sio.loadmat(foldData)  # dimension: 50x12x301
class Dataset(Dataset):
    def __init__(self):
        #dataloading
        xy= data['data']
        self.x=torch.from_numpy(xy).float() # data
        self.y=torch.from_numpy(data['label']).float() #labels
        self.n_samples=xy.shape[0] # number of samples
        
    def __getitem__(self, index):
        return self.x[index], self.y[index] # return a tuple 
    
    def __len__(self):
        return self.n_samples  # length of function

dataset = Dataset()    
total_count=50
train_count = int(0.7 * total_count) 
valid_count = int(0.2 * total_count)
test_count = total_count - train_count - valid_count
train_dataset, valid_dataset, test_dataset = torch.utils.data.random_split(
    dataset, (train_count, valid_count, test_count)
) 
#train_dataset: 35x12x301 with 35 label, test_dataset: 5x12x301 with 5 label, valid_dataset: #10x12x301 with 10 label 
train_dataloader=DataLoader(dataset=train_dataset, batch_size=3, shuffle=True)
test_dataloader=DataLoader(dataset=test_dataset, batch_size=3, shuffle=True)
valid_dataloader=DataLoader(dataset=valid_dataset, batch_size=3, shuffle=True)
dataiter = iter(train_dataloader)
data = next(dataiter)
features, labels= data
print(features, labels)

# CNN network
class ConvNet1D(nn.Module):
    def __init__(self):
        super(ConvNet1D,self).__init__()
        self.layer1=nn.Conv2d(in_channels=1, out_channels=32, kernel_size=(1,10))
        self.relu1 = nn.ReLU()       
        self.flat = nn.Flatten()
        # initialize our softmax classifier
        self.fc2 = nn.Linear(in_features=32*12*292, out_features=2) #classes =2
        self.logSoftmax = nn.LogSoftmax(dim=1)
    def forward(self, x):
       x = self.relu1(self.layer1(x))
       x = self.flat(x)
       x = self.fc2(x)
       output = self.logSoftmax(x)
       # return the output predictions
       return output

# Check for GPU
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)

model = ConvNet1D()
model.to(device)
# from torchsummary import summary
# summary(ConvNet1D(), input_size = [(1,12,301)])
# Loss and optimizer
learning_rate=0.001
criterion = torch.nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

# Train the model
total_step = len(train_dataloader)
min_valid_loss = np.inf
num_epochs=2
train_loss = 0.0
for epoch in range(num_epochs):
    train_loss = 0.0
    for i,(features,label) in enumerate(train_dataloader):
            features = features.to(device)
            features = features.unsqueeze(0)
            #features = features.permute(1,0)
            label = label.to(device)
            # Clear the gradients
            optimizer.zero_grad()
            # Forward Pass
            target = model(features)
            # Find the Loss
            loss = criterion(target,label)
            # Calculate gradients
            loss.backward()
            # Update Weights
            optimizer.step()
            # Calculate Loss
            train_loss += loss.item()
            
            if (i+1)% 10 == 0:
                print(f'Epoch [{epoch+1}/{num_epochs}], Step[{i+1}/{total_step}], Loss:{loss.item():.4f}')

Here is the error:
RuntimeError: Given groups=1, weight of size [32, 1, 1, 10], expected input[1, 32, 12, 301] to have 1 channels, but got 32 channels instead.

I apologize for the recurring question. I couldn't resolve the error even after implementing a DataLoader. Can you please tell me what modifications the code needs?
Thank you very much.

The shape of the input shown in the error message ([1, 32, 12, 301]) corresponds neither to the shape you claim the input should have (50x12x301 for the dataset), nor does the specified batch size of 3 in the DataLoader show up in any dimension.

Based on the error message you are still using a single sample with a channel dimension of 32, so I guess you are again passing a 3D input tensor to the model instead of a 4D one.
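To illustrate, here is a small sketch that assumes the DataLoader actually yields batches of 32 samples, as the error message suggests (the random batch is a hypothetical stand-in). Calling unsqueeze(0) pushes the batch into the channel dimension and reproduces the error, whereas unsqueeze(1) adds the missing channel dimension.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=(1, 10))

# Hypothetical batch as a DataLoader would yield it: [batch=32, leads=12, time=301]
features = torch.randn(32, 12, 301)

wrong = features.unsqueeze(0)  # [1, 32, 12, 301]: the batch lands in the channel dim
right = features.unsqueeze(1)  # [32, 1, 12, 301]: one channel per sample

try:
    conv(wrong)
    failed = False
except RuntimeError as err:
    failed = True
    print(err)  # channel mismatch: the layer expects 1 channel but gets 32

out = conv(right)
print(out.shape)  # torch.Size([32, 32, 12, 292])
```

The per-sample output of 32 x 12 x 292 values is also what the in_features=32*12*292 of the linear layer in the posted model expects after flattening.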

Thank you very much. I resolved the error. I needed to include the following to give the input a suitable shape for the Conv2d layer:
features.unsqueeze(1) # size of features before this step was (32,12,301) and after unsqueeze it is (32,1,12,301)
I have another query. I am trying to solve a two-class classification problem (whether the subject is normal or diseased based on ECG). My training data is (6000,12,301) and includes 3000 normal and 3000 diseased samples. The corresponding label is (6000,2) (normal(0)/not normal(1)). This is how I trained the model.

learning_rate=0.001
criterion =nn.CrossEntropyLoss()# F.binary_cross_entropy_with_logits()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
train_loss = 0.0
correct = 0
for i,(features,label) in enumerate(train_dataloader):
    features = features.unsqueeze(1)
    features = features.to(device)
    #(32,1,12,301)
    label = label.to(device)
    # Clear the gradients
    optimizer.zero_grad()
    # Forward Pass
    target = model(features)
    # Find the Loss
    loss = criterion(target, label)
    # Calculate gradients
    loss.backward()
    # Update Weights
    optimizer.step()
    # Calculate Loss
    train_loss += loss.item()
    _, predicted = torch.max(target, 1)
    _, actual = torch.max(label, 1)
    # accumulate correct predictions across batches
    correct += (predicted == actual).sum().item()

Are actual and predicted calculated correctly, given that my label is (6000,2)? I am not sure this is how I need to calculate the prediction for cross entropy.

Thank you very much

nn.CrossEntropyLoss expects a model output in the shape [batch_size, nb_classes, *] containing logits and a target in the shape [batch_size, *] containing class indices or in the same shape as the output containing probabilities. I assume the latter is the case here.

In this case your code looks alright.
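As a small sketch of the two target formats (random logits and labels as hypothetical stand-ins, and assuming a PyTorch version that accepts probability targets, 1.10 or newer), class indices and their one-hot float encoding should give the same loss:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
criterion = nn.CrossEntropyLoss()

logits = torch.randn(32, 2)             # raw model output: [batch_size, nb_classes]
class_idx = torch.randint(0, 2, (32,))  # targets as class indices: [batch_size]
# targets as probabilities: same shape as the output, floating point
one_hot = F.one_hot(class_idx, num_classes=2).float()

loss_idx = criterion(logits, class_idx)
loss_prob = criterion(logits, one_hot)
print(torch.allclose(loss_idx, loss_prob))  # True
```

Note that the probability form needs a floating point target, which is why the one-hot tensor is converted with .float().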

Thank you very much. Can you please help me understand the reasoning behind these lines:

_, predicted = torch.max(target, 1) # (32,2)
_, actual = torch.max(label, 1) #(32,2)

I understand cross entropy gives the probabilities, hence we take torch.max(target,1) for the prediction. Why do we have to use torch.max(label,1) for the label in this case?

I am a beginner trying to understand the concepts. I apologize if it's a very silly question.
Thank you.

Assuming you are one-hot encoding the labels you could also see them as a probability where 1 indicates the active class and 0 the inactive one.
Here is a small example:

import torch
import torch.nn.functional as F

# create class labels
y = torch.randint(0, 2, (5,))
print(y)
# tensor([0, 1, 0, 0, 1])

# create one-hot encoded tensor
labels = F.one_hot(y, num_classes=2)
print(labels)
# tensor([[1, 0],
#         [0, 1],
#         [1, 0],
#         [1, 0],
#         [0, 1]])

# transform back to class indices
preds = torch.argmax(labels, dim=1)
print(preds)
# tensor([0, 1, 0, 0, 1])

Hi,
I have attached the figure of the training and validation loss with an implementation of early stopping. I can't understand why the validation loss stays almost the same (minimal change in loss), and the training loss too. Batch size = 32, learning rate = 0.0001 (I got the same response with 0.001). The dataset is divided into 70%, 20%, 10% as train, validation and test sets respectively (total samples: 8144). I am using a one-layer 2D CNN for binary classification. Can someone explain how exactly to interpret the model's behavior from the graph, or suggest improvements if the model is either underfitting or overfitting?


Thank you very much
Geerthy

You could try to overfit a small subset of the dataset, e.g. just a single (or a few) samples, to make sure your model is able to learn and overfit to it.
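A rough sketch of such a sanity check, assuming a model similar to the one posted earlier in the thread and four random samples as a hypothetical stand-in for a real subset: train repeatedly on the same tiny batch and watch whether the training loss drops towards zero.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical tiny subset: 4 samples shaped [batch, channels=1, leads=12, time=301]
x = torch.randn(4, 1, 12, 301)
y = torch.tensor([0, 1, 0, 1])

# Assumed architecture, modelled on the single-layer CNN from the thread
model = nn.Sequential(
    nn.Conv2d(1, 32, kernel_size=(1, 10)),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(32 * 12 * 292, 2),
)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

losses = []
for step in range(50):
    optimizer.zero_grad()
    loss = criterion(model(x), y)  # same samples every step, on purpose
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

If the loss does not fall even on a handful of samples, the problem is more likely in the data pipeline, the labels, or the loss setup than in the model capacity.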

Thank you. I have tried with fewer samples (100 split as 60 for training, 20 for validation and 10 for testing). Batch size is 8, learning rate is 0.001.


Is the model overfitting? In general, if the validation loss increases after a certain number of epochs, the model is overfitting. However, here there are lots of ups and downs in the validation loss. Is it safe to conclude from the graph that the model is able to learn and overfit?
Thank you

The model is still learning so you should make sure the training loss is converging towards zero.