Transferring a model from Keras to PyTorch

I have a working Keras model from this paper: https://arxiv.org/pdf/2006.08296.pdf
that I’d like to rewrite in PyTorch.

import torch
import torch.nn as nn


class CaptchaSolver(nn.Module):
    def __init__(self):
        super(CaptchaSolver, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=1,
                               kernel_size=(5, 5), padding=[2, 2])
        self.dense1 = nn.Linear(67, 32)
        self.conv2 = nn.Conv2d(in_channels=1, out_channels=1,
                               kernel_size=(5, 5), padding=[2, 2])
        self.dense2 = nn.Linear(33*12, 48)
        
        self.conv3 = nn.Conv2d(in_channels=1, out_channels=1,
                               kernel_size=(5, 5), padding=[2, 2])
        self.dense3 = nn.Linear(16*8, 64)
        
        self.dense4 = nn.Linear(8*4, 512)
        self.drop = nn.Dropout2d(0.3)
        
        self.out_dense1 = nn.Linear(12288, 31)
        self.out_dense2 = nn.Linear(12288, 31)
        self.out_dense3 = nn.Linear(12288, 31)
        self.out_dense4 = nn.Linear(12288, 31)
        self.out_dense5 = nn.Linear(12288, 31)
        self.out_dense6 = nn.Linear(12288, 31)
        
        self.maxp = nn.MaxPool2d((2, 2))
        self.softm = nn.Softmax(dim=1)
        self.relu = nn.ReLU()
    def forward(self, x):
        x = self.conv1(x)
        x = self.dense1(x)
        x = self.relu(x)
        x = self.maxp(x)
        x = self.conv2(x)
        x = self.dense2(x)
        x = self.relu(x)
        x = self.maxp(x)
        x = self.conv3(x)
        x = self.dense3(x)
        x = self.relu(x)
        x = self.maxp(x)
        x = self.dense4(x)
        x = self.drop(x)
        x = torch.flatten(x)
        x1 = self.softm(self.out_dense1(x))
        x2 = self.softm(self.out_dense2(x))
        x3 = self.softm(self.out_dense3(x))
        x4 = self.softm(self.out_dense4(x))
        x5 = self.softm(self.out_dense5(x))
        x6 = self.softm(self.out_dense6(x))
        out = torch.cat([x1, x2, x3, x4, x5, x6])
        return out 

If I pass the last dimension of my tensor as in_features to the linear layer, the layer only changes that dimension to its out_features. If I instead pass 67 * 25 * 1 (the width * height * channels of the generated captchas), it stops with "RuntimeError: mat1 dim 1 must match mat2 dim 0". I would like it to work as closely to my Keras model as possible. What should I change?

First, you have to change the out_channels of your conv layers in order to stay as close as possible to your Keras model. For example, conv1 needs to have out_channels=2.

As a next step, your dense layers need to be implemented differently. The first dense layer, for example, has 96 parameters. With 32 output units that is 2 * 32 weights plus 32 biases, which means

        self.dense1 = nn.Linear(2, 32)
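As a quick sanity check of that count, here is a minimal sketch that assumes the Keras dense layer uses a bias (the default): a linear layer with 2 inputs and 32 outputs holds 2 * 32 weights plus 32 biases. The variable names are only illustrative.

```python
import torch.nn as nn

# Hypothetical stand-alone copy of the first dense layer.
dense1 = nn.Linear(2, 32)

# weight has shape (32, 2) -> 64 values, bias has shape (32,) -> 32 values.
n_params = sum(p.numel() for p in dense1.parameters())
print(n_params)  # 96
```

The same check can be run against each layer of the Keras model summary to recover the in_features of the other dense layers.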

The dense layer operates on the last dimension as far as I know, so in my opinion you would need something like

 x = x.view(1, -1, 2)
 x = self.dense1(x)
 x = x.view(1, 32, 25, 67)

with batch_size = 1. Transposing would also be an option. To be honest, I don't really know if this is the best solution or implementation; I just wanted to point out the differences between the given model and yours.
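For the transposing option, a rough sketch could look like the following. It assumes a single-channel 25 x 67 (height x width) input, as described in the question, and uses permute to move the channel axis to the last position so that nn.Linear acts on the channels, the way a Keras Dense layer acts on a channels-last tensor. Note that view only reinterprets the memory layout and does not move the channel axis, so the two variants are not equivalent in general. The layer sizes are taken from the suggestion above; everything else is illustrative.

```python
import torch
import torch.nn as nn

# Assumed layer sizes: conv1 with 2 output channels, dense1 mapping 2 -> 32.
conv1 = nn.Conv2d(in_channels=1, out_channels=2, kernel_size=5, padding=2)
dense1 = nn.Linear(2, 32)

x = torch.randn(1, 1, 25, 67)  # (batch, channels, height, width)
x = conv1(x)                   # (1, 2, 25, 67)
x = x.permute(0, 2, 3, 1)      # channels last: (1, 25, 67, 2)
x = dense1(x)                  # linear layer acts on the last dim: (1, 25, 67, 32)
x = x.permute(0, 3, 1, 2)      # back to channels first: (1, 32, 25, 67)
print(x.shape)
```

Unlike the view version, this does not hard-code the batch size. A following conv layer would then need in_channels=32 to accept this output.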

Best regards

I did something similar to what you described and it seems like it’s working, thanks!