Suboptimal convergence when compared with TensorFlow model

I see the same problem.

I added optim.lr_scheduler.StepLR(optimizer, step_size=50, gamma=0.2) to the PyTorch training, and with it the performance is comparable with TensorFlow.

I do not know why this helps, though.
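
For anyone who wants to see what that scheduler setting does to the learning rate, here is a minimal pure-Python sketch of the step-decay rule that StepLR applies: the rate is multiplied by gamma once every step_size scheduler steps. It assumes scheduler.step() is called once per epoch. The base learning rate of 1e-3 and the helper name step_lr are made up for illustration and are not taken from the model in this thread.

```python
def step_lr(base_lr, epoch, step_size=50, gamma=0.2):
    """Learning rate at `epoch` under step decay: base_lr * gamma ** (epoch // step_size)."""
    return base_lr * gamma ** (epoch // step_size)


base_lr = 1e-3  # hypothetical starting learning rate, not from the original post

# Sample the schedule just before and after each decay boundary.
schedule = {epoch: step_lr(base_lr, epoch) for epoch in (0, 49, 50, 99, 100, 150)}

for epoch, lr in schedule.items():
    print(f"epoch {epoch:3d}: lr = {lr:.2e}")
```

So with step_size=50 and gamma=0.2 the rate stays constant for the first 50 epochs and is then cut to one fifth at epochs 50, 100, 150, and so on. It may be worth checking whether the TensorFlow run also decays its learning rate in some way.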