I created this simple model in Keras:
import numpy as np
from tensorflow import keras
from tensorflow.keras.layers import Conv2D, Flatten, Dense

model = keras.Sequential([
    Conv2D(3, (3, 3), activation='relu', input_shape=(32, 32, 3)),  # input_shape builds the model so get_weights() is populated
    Flatten(),
    Dense(10, activation='softmax')
])
keras_weights = model.get_weights()  # [conv kernel, conv bias, dense kernel, dense bias]
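For reference, these are the shapes Keras reports for the weights (Keras stores the conv kernel as (kh, kw, in_channels, out_channels) and the dense kernel as (in_features, out_features)):

for w in keras_weights:
    print(w.shape)
# (3, 3, 3, 3)   conv kernel: (kh, kw, in, out)
# (3,)           conv bias
# (2700, 10)     dense kernel: (in, out)
# (10,)          dense bias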
And I created a similar model in PyTorch:
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 3, kernel_size=3)
        self.fc = nn.Linear(2700, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = x.view(-1, 2700)
        x = self.fc(x)
        return F.softmax(x, dim=1)  # pass dim explicitly to avoid the deprecation warning

pt_model = Net()
I know Keras stores its parameters in a different shape order than PyTorch, as discussed here. So this is how I transfer the Keras parameters to PyTorch:
with torch.no_grad():
    for i, (n, p) in enumerate(pt_model.named_parameters()):
        if 'conv' in n and 'weight' in n:
            # Keras conv kernel (kh, kw, in, out) -> PyTorch (out, in, kh, kw)
            p.copy_(torch.from_numpy(np.transpose(keras_weights[i], (3, 2, 0, 1))))
        elif 'weight' in n:
            # Keras dense kernel (in, out) -> PyTorch (out, in)
            p.copy_(torch.from_numpy(np.transpose(keras_weights[i], (1, 0))))
        else:
            # biases have the same shape in both frameworks
            p.copy_(torch.from_numpy(keras_weights[i]))
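To rule out a copying mistake, here is a quick sanity check (a sketch) that the conv weights round-trip correctly:

# The copied PyTorch conv weight should match the re-transposed Keras kernel
expected = np.transpose(keras_weights[0], (3, 2, 0, 1))
print(np.allclose(pt_model.conv1.weight.detach().numpy(), expected))  # True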
This runs without any size-mismatch error. However, when I compared the outputs, they differed:
x = np.ones((1, 32, 32, 3), dtype=np.float32) / 2  # NHWC input for Keras
y = torch.ones((1, 3, 32, 32)) / 2                 # NCHW input for PyTorch

pt_model.eval()
print(pt_model(y))
print(model(x))
# output:
# [[0.1495, 0.0770, 0.0740, 0.0601, 0.1135, 0.1814, 0.1109, 0.0918, 0.0777, 0.0640]] # pytorch
# [[0.13737796 0.0906269 0.12652941 0.08460842 0.11374795 0.10845644 0.07447342 0.0781411 0.10338538 0.08265299]] # keras
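The gap is well beyond float32 rounding error; comparing the two outputs directly:

out_pt = pt_model(y).detach().numpy()
out_keras = model(x).numpy()
print(np.abs(out_pt - out_keras).max())  # ~0.07, far larger than float32 precision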
But when I removed the conv layer (leaving only a single dense layer), the transfer worked and both models produced the same values (up to floating-point precision). Can someone help me find the problem in the transfer process?
I've provided the full code in a notebook here: Google Colab