Loss decreases differently between PyTorch and Keras

I ran into a problem while rewriting a network from Keras in PyTorch.
In Keras, I use
optimizer = keras.optimizers.Adam(lr = 1e-4)
model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])
and in PyTorch I use
optimizer = optim.Adam(model.parameters(), lr=0.0001)
With Keras the loss keeps decreasing until it reaches 0.04, but with PyTorch it gets stuck at 0.2.
I use the same data in both, and BCELoss in PyTorch. How can I fix this?

Often the discrepancy is due to different parameter initializations.
Could you make sure to initialize your layers with the torch.nn.init methods that match the initializers in your Keras model?

I use he_normal to initialize the layers in Keras, so what should I use in PyTorch?

torch.nn.init.kaiming_normal_ corresponds to Keras' he_normal (the older name kaiming_normal, without the trailing underscore, is deprecated).
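
As a rough sketch of how that could be applied: the model below is a hypothetical stand-in (the original architecture was not posted), and the helper name init_he_normal is made up for illustration. It assumes the Keras layers use he_normal for the kernels and the default zero bias initializer. Note that Keras' he_normal samples from a truncated normal, while kaiming_normal_ samples from a plain normal with the same target standard deviation sqrt(2 / fan_in), so the match is close but not exact.

```python
import torch
import torch.nn as nn


def init_he_normal(module):
    """Hypothetical helper: He-normal weights and zero biases for Linear/Conv2d layers."""
    if isinstance(module, (nn.Linear, nn.Conv2d)):
        # fan_in mode with ReLU gain gives std = sqrt(2 / fan_in), as in he_normal
        nn.init.kaiming_normal_(module.weight, mode="fan_in", nonlinearity="relu")
        if module.bias is not None:
            # Keras layers initialize biases to zero by default
            nn.init.zeros_(module.bias)


# Placeholder architecture; replace with the layers of the actual network
model = nn.Sequential(
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, 1),
    nn.Sigmoid(),  # BCELoss expects probabilities in [0, 1]
)

# apply() visits every submodule, so each matching layer is re-initialized
model.apply(init_he_normal)

criterion = nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```

Calling model.apply(init_he_normal) once, right after constructing the model and before creating the optimizer, should be enough; layers of other types keep their PyTorch defaults.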

