Your code looks generally good!
Could you try to apply the same weight initializations that are used in Keras to compare the models?
Here is a small example.
Also, could you post the Keras code, as there still might be some small differences?
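As a rough sketch of what matching the Keras defaults could look like: Keras `Dense` and `Conv2D` layers default to `glorot_uniform` kernels and zero biases, which correspond to `nn.init.xavier_uniform_` and `nn.init.zeros_` in PyTorch. The helper name `keras_like_init` and the small `nn.Sequential` model are made up for illustration, since I don't know your actual architecture or whether your Keras layers override the default initializers.

```python
import torch
import torch.nn as nn


def keras_like_init(module):
    # Hypothetical helper: mimic the Keras defaults for Dense/Conv2D layers
    # (glorot_uniform kernel, zero bias). Other layer types are left untouched.
    if isinstance(module, (nn.Linear, nn.Conv2d)):
        nn.init.xavier_uniform_(module.weight)
        if module.bias is not None:
            nn.init.zeros_(module.bias)


# Placeholder model; swap in your own nn.Module here.
model = nn.Sequential(
    nn.Linear(10, 20),
    nn.ReLU(),
    nn.Linear(20, 2),
)

# apply() calls the function on every submodule recursively.
model.apply(keras_like_init)
```

If your Keras model passes custom `kernel_initializer` or `bias_initializer` arguments, the PyTorch side would need to mirror those instead.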
Some minor issues:
- `Variable`s are deprecated, and you can use tensors directly since PyTorch `0.4.0`.
- It’s generally recommended to call the model directly instead of `forward`. You could change `self.forward(x)` to `self(x)`.
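Both points can be illustrated with a minimal sketch. `TinyNet` is a made-up module, not your model. The input is a plain tensor without any `Variable` wrapper, and the forward hook shows one practical difference: calling the module goes through `__call__`, which runs registered hooks, while calling `forward` directly skips them.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):  # hypothetical model, for illustration only
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)


model = TinyNet()

# Record every time the forward hook fires.
calls = []
model.register_forward_hook(
    lambda module, inputs, output: calls.append(tuple(output.shape))
)

x = torch.randn(3, 4)  # plain tensor; no Variable wrapper needed

out = model(x)  # goes through __call__, so the hook runs
hook_calls_after_call = len(calls)

model.forward(x)  # bypasses __call__, so the hook does not run
hook_calls_after_forward = len(calls)

# Autograd works on plain tensors as well.
out.sum().backward()
```

The same applies inside a module: `self(x)` dispatches through `__call__`, whereas `self.forward(x)` does not.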