Suboptimal convergence when compared with TensorFlow model

hrdeepak · April 19, 2020, 10:01am

In my experiment, however, I followed these two steps and ended up with similar results:

Used nn.init.xavier_uniform_ for the weights and nn.init.constant_ for the biases.
In the Adam optimizer, PyTorch uses a default of eps=1e-8 vs. TensorFlow’s epsilon=1e-7. Changed it to 1e-7.
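
Roughly, the two changes could look like the sketch below. The model, the layer sizes, and the helper name init_like_tf are made up for illustration, and setting the biases to 0.0 is my assumption (it matches the Keras default bias initializer, but the value isn't stated above).

```python
import torch
from torch import nn


def init_like_tf(module: nn.Module) -> None:
    """Hypothetical helper: Xavier/Glorot-uniform weights, constant biases."""
    if isinstance(module, (nn.Linear, nn.Conv2d)):
        nn.init.xavier_uniform_(module.weight)
        if module.bias is not None:
            # Assumption: zero biases, as in Keras' default bias initializer.
            nn.init.constant_(module.bias, 0.0)


# Placeholder model, only here to make the example runnable.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
model.apply(init_like_tf)  # apply() visits every submodule recursively

# PyTorch's Adam defaults to eps=1e-8; pass 1e-7 to match TensorFlow's epsilon.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, eps=1e-7)
```

The lr=1e-3 here is just the default in both frameworks; use whatever your TensorFlow run used.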

Hope this helps