Suboptimal convergence when compared with TensorFlow model

recastrodiaz (Rodrigo Castro) July 22, 2017, 12:09am 2

I have had similar issues with PyTorch vs. Keras. I haven't found a simple answer, but here are some other things I would check:

Is Keras using any regularizers or constraints?
Is Keras using biases whilst PyTorch is not?
Are you computing the loss the exact same way?
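
On the last question, here is a minimal sketch of one way two "identical" losses can differ: the reduction. It uses plain Python and made-up toy numbers (the data and both function names are hypothetical, not taken from either framework or from the models discussed here). One version averages the squared error over every element, the other sums over the outputs and then averages over the batch.

```python
# Hypothetical toy data: 2 samples with 3 outputs each (illustration only).
predictions = [[0.5, 1.0, 1.5], [2.0, 2.5, 3.0]]
targets = [[1.0, 1.0, 1.0], [2.0, 2.0, 2.0]]


def mse_mean_over_all(preds, targs):
    """Average the squared error over every element (batch and outputs)."""
    errors = [
        (p - t) ** 2
        for pred_row, targ_row in zip(preds, targs)
        for p, t in zip(pred_row, targ_row)
    ]
    return sum(errors) / len(errors)


def mse_sum_over_outputs(preds, targs):
    """Sum the squared error over outputs, then average over the batch."""
    per_sample = [
        sum((p - t) ** 2 for p, t in zip(pred_row, targ_row))
        for pred_row, targ_row in zip(preds, targs)
    ]
    return sum(per_sample) / len(per_sample)


loss_a = mse_mean_over_all(predictions, targets)
loss_b = mse_sum_over_outputs(predictions, targets)

# The two losses differ by a factor equal to the number of outputs per sample.
ratio = loss_b / loss_a
print(loss_a, loss_b, ratio)
```

If the two frameworks reduce the loss differently, the gradients are scaled differently too, which can change how the loss balances against any regularization term. Feeding one fixed batch through both models and comparing the raw loss values is a quick way to rule this out.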

400% higher error with PyTorch compared with identical Keras model (with Adam optimizer)