Suboptimal convergence when compared with TensorFlow model

(Kai Arulkumaran) #21

There are many factors that can cause differences. Some people have reported things to try here.

(Jeff) #22

Same problem here. I cannot replicate the TF Adam optimizer's success in PyTorch.

Edit: Disregard. I’m actually getting a better loss in PyTorch than in TF with Adam, now that I’m actually taking the mean of my losses.
The size_average=False setting found in jcjohnson’s GitHub examples sums the losses instead of averaging them, which can make for a long night for a newbie.
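
To illustrate the difference between summing and averaging, here is a rough sketch in plain Python (no PyTorch, so it runs anywhere). The model y_hat = w * x, the helper name squared_error_loss_and_grad, and the data are all made up for illustration; they are not taken from jcjohnson's examples. It only shows that a summed squared-error loss and its gradient come out batch-size times larger than the averaged ones, which makes loss values hard to compare across frameworks unless both use the same reduction.

```python
# Hypothetical toy example: summed vs. averaged squared-error loss
# for the one-parameter model y_hat = w * x.

def squared_error_loss_and_grad(w, xs, ys, average):
    """Return (loss, d(loss)/dw) over the batch, summed or averaged."""
    n = len(xs)
    loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys))
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys))
    if average:
        # Averaging divides both the loss and its gradient by the batch size.
        loss /= n
        grad /= n
    return loss, grad

# Made-up batch of four samples; the true relationship is y = 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = 0.5

sum_loss, sum_grad = squared_error_loss_and_grad(w, xs, ys, average=False)
mean_loss, mean_grad = squared_error_loss_and_grad(w, xs, ys, average=True)

print("summed  :", sum_loss, sum_grad)
print("averaged:", mean_loss, mean_grad)
print("ratio   :", sum_loss / mean_loss)  # equals the batch size
```

With plain gradient descent, the larger gradient acts like a learning rate multiplied by the batch size. Adam normalizes gradient magnitudes, so there the most visible effect is probably the scale of the reported loss rather than the updates themselves.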