Translating a Keras model to PyTorch

berlin · July 20, 2023, 10:05am

I’m trying to translate the 3-layer CNN architecture below from Keras to PyTorch. The model predicts an expression value (input_shape_val) from a DNA sequence (input_shape_hot); the sequence is one-hot encoded. The original architecture stacks a CNN (3 layers) and an FC part (2 layers) consecutively, with batch normalization and weight dropout applied after all layers and max-pooling after the CNN layers (ref-paper, ref-code).

from typing import List

import torch
from torch import nn

class DNA_CNN_test2(nn.Module):
    def __init__(self,
                 seq_len: int,
                 num_filters: List[int] = [32, 64, 128],
                 kernel_size: int = 3,
                 p = 0.2):
        super().__init__()
        self.seq_len = seq_len
        # CNN module
        self.conv_net = nn.Sequential()
        num_filters = [4] + num_filters
        for idx in range(len(num_filters) - 1):
            self.conv_net.add_module(
                f"conv_{idx}",
                nn.Conv1d(num_filters[idx], num_filters[idx + 1],
                          kernel_size=kernel_size, padding='same')
            )
            self.conv_net.add_module(f"relu_{idx}", nn.ReLU(inplace=True))
            self.conv_net.add_module(f"batchNor_{idx}", nn.BatchNorm1d(num_filters[idx + 1]))
            self.conv_net.add_module(f"MaxP_{idx}", nn.MaxPool1d(kernel_size=2, stride=4))
            self.conv_net.add_module(f"dropout_{idx}", nn.Dropout(p))
        self.conv_net.add_module("flatten", nn.Flatten())
        self.conv_net.add_module("linear",nn.Linear(num_filters[-1]*seq_len, 1))
        #self.conv_net.add_module("linear",nn.Linear(64, 1))
        #self.conv_net.add_module("relu", nn.ReLU(inplace=True))
        #self.conv_net.add_module("batch_normal",nn.BatchNorm1d(64))
        #self.conv_net.add_module("drop",nn.Dropout(0.2))
        #self.conv_net.add_module("linear",nn.Linear(num_filters[-1]*seq_len, 1))
        
        
        
    def forward(self, xb: torch.Tensor):
        """Forward pass."""
        xb = xb.permute(0, 2, 1)  # (batch, seq_len, 4) -> (batch, 4, seq_len)
        out = self.conv_net(xb)
        return out

However, I am having trouble using MaxPool1d after Conv1d. I have already tried many suggestions from answers to similar questions, but none of them worked. Any suggestions about what I should do?

Here is the error message:
mat1 and mat2 shapes cannot be multiplied (2048x2048 and 128000x1)

ptrblck · July 21, 2023, 5:08am

The shape mismatch error seems to be raised in your linear layer. Based on the error message, its in_features value does not match the number of features of its input activation.
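To see where the 2048 features in the error message could come from, it may help to trace the sequence length through the three pooling layers. The sketch below assumes seq_len = 1000 (inferred from 128000 / 128 in the error message, not stated in the post) and uses the output-length formula from the nn.MaxPool1d docs for no padding and dilation 1. pool_out_len is a hypothetical helper name.

```python
def pool_out_len(length: int, kernel_size: int = 2, stride: int = 4) -> int:
    """Output length of nn.MaxPool1d with padding=0, dilation=1, ceil_mode=False."""
    return (length - kernel_size) // stride + 1


seq_len = 1000        # assumption: 128000 / 128 from the error message
last_channels = 128   # num_filters[-1] in the posted model

length = seq_len
for _ in range(3):    # the padding='same' convs keep the length; only pooling shrinks it
    length = pool_out_len(length)

flat_features = last_channels * length
print(length, flat_features)  # 16 2048
```

Under these assumptions nn.Flatten produces 128 * 16 = 2048 features per sample, while the linear layer was built for 128 * 1000 = 128000.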
Assuming you are using 2048 samples, each with a feature dimension of 2048, setting in_features=2048 in the linear layer should work. You should also compare it to the Keras implementation to make sure the actual model architecture is the same.
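Rather than hard-coding in_features, one option is to push a dummy tensor through the convolutional part once and read off the flattened size. This is only a sketch under assumptions: DNACNNSketch, features and head are hypothetical names, the layer order is copied from the posted model, and it has not been checked against the Keras reference code.

```python
from typing import Sequence

import torch
from torch import nn


class DNACNNSketch(nn.Module):
    """Hypothetical variant of the posted model with an inferred Linear size."""

    def __init__(self, seq_len: int, num_filters: Sequence[int] = (32, 64, 128),
                 kernel_size: int = 3, p: float = 0.2):
        super().__init__()
        channels = [4, *num_filters]  # 4 input channels for one-hot DNA
        layers = []
        for c_in, c_out in zip(channels[:-1], channels[1:]):
            layers += [
                nn.Conv1d(c_in, c_out, kernel_size=kernel_size, padding="same"),
                nn.ReLU(),
                nn.BatchNorm1d(c_out),
                nn.MaxPool1d(kernel_size=2, stride=4),
                nn.Dropout(p),
            ]
        layers.append(nn.Flatten())
        self.features = nn.Sequential(*layers)

        # Infer the flattened size with a dummy forward pass. eval() keeps the
        # BatchNorm running statistics untouched by the dummy input.
        self.features.eval()
        with torch.no_grad():
            n_flat = self.features(torch.zeros(1, 4, seq_len)).shape[1]
        self.features.train()

        self.head = nn.Linear(n_flat, 1)

    def forward(self, xb: torch.Tensor) -> torch.Tensor:
        xb = xb.permute(0, 2, 1)  # (batch, seq_len, 4) -> (batch, 4, seq_len)
        return self.head(self.features(xb))


model = DNACNNSketch(seq_len=1000)  # seq_len=1000 is an assumption
out = model(torch.randn(8, 1000, 4))
print(model.head.in_features, out.shape)
```

nn.LazyLinear(1) would be an alternative that infers in_features on the first real forward pass. Either way, it is worth comparing the layer output shapes against the Keras model.summary() output to confirm that the pooling settings match.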