I created this simple model in Keras:
import numpy as np
from tensorflow import keras
from tensorflow.keras.layers import Conv2D, Flatten, Dense

model = keras.Sequential([
    Conv2D(3, (3, 3), activation='relu', input_shape=(32, 32, 3)),  # input_shape builds the model so get_weights() is populated
    Flatten(),
    Dense(10, activation='softmax')
])
keras_weights = model.get_weights()  # [conv kernel, conv bias, dense kernel, dense bias]
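For reference, these are the shapes Keras reports for the weights (Keras stores the conv kernel as (kh, kw, in_channels, out_channels) and the dense kernel as (in_features, out_features)):

for w in keras_weights:
    print(w.shape)
# (3, 3, 3, 3)   conv kernel: (kh, kw, in, out)
# (3,)           conv bias
# (2700, 10)     dense kernel: (in, out)
# (10,)          dense bias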
And I created a similar model in PyTorch:
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 3, kernel_size=3)
        self.fc = nn.Linear(2700, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = x.view(-1, 2700)
        x = self.fc(x)
        return F.softmax(x, dim=1)  # pass dim explicitly to avoid the deprecation warning

pt_model = Net()
I know Keras stores its parameters in a different shape order than PyTorch, as discussed here. So this is how I transfer the Keras parameters to PyTorch:
with torch.no_grad():
    for i, (n, p) in enumerate(pt_model.named_parameters()):
        if 'conv' in n and 'weight' in n:
            # Keras conv kernel (kh, kw, in, out) -> PyTorch (out, in, kh, kw)
            p.copy_(torch.from_numpy(np.transpose(keras_weights[i], (3, 2, 0, 1))))
        elif 'weight' in n:
            # Keras dense kernel (in, out) -> PyTorch (out, in)
            p.copy_(torch.from_numpy(np.transpose(keras_weights[i], (1, 0))))
        else:
            # biases have the same shape in both frameworks
            p.copy_(torch.from_numpy(keras_weights[i]))
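To rule out a copying mistake, here is a quick sanity check (a sketch) that the conv weights round-trip correctly:

# The copied PyTorch conv weight should match the re-transposed Keras kernel
expected = np.transpose(keras_weights[0], (3, 2, 0, 1))
print(np.allclose(pt_model.conv1.weight.detach().numpy(), expected))  # True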
This runs without any size-mismatch error. However, when I compared the outputs, they differed:
x = np.ones((1, 32, 32, 3), dtype=np.float32) / 2  # NHWC input for Keras
y = torch.ones((1, 3, 32, 32)) / 2                 # NCHW input for PyTorch

pt_model.eval()
print(pt_model(y))
print(model(x))
# output:
# [[0.1495, 0.0770, 0.0740, 0.0601, 0.1135, 0.1814, 0.1109, 0.0918, 0.0777, 0.0640]] # pytorch
# [[0.13737796 0.0906269 0.12652941 0.08460842 0.11374795 0.10845644 0.07447342 0.0781411 0.10338538 0.08265299]] # keras
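The gap is well beyond float32 rounding error; comparing the two outputs directly:

out_pt = pt_model(y).detach().numpy()
out_keras = model(x).numpy()
print(np.abs(out_pt - out_keras).max())  # ~0.07, far larger than float32 precision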
But when I removed the conv layer (leaving only a single dense layer), the transfer worked and both models produced the same values (up to floating-point precision). Can someone help me find the problem in the transfer process?
I've provided the full code in a notebook here: Google Colab