PyTorch comparable but worse than keras on a simple feed forward network

SimonW · November 14, 2017, 5:29pm

Comparing the Keras and PyTorch documentation for RMSprop, it seems that PyTorch’s default lr is 10x as large as Keras’s (0.01 vs. 0.001). Do you have some other code that changes lr somewhere?

keras: https://keras.io/optimizers/#rmsprop
pytorch: http://pytorch.org/docs/0.2.0/optim.html#torch.optim.RMSprop
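
To match Keras’s default on the PyTorch side, you can pass the learning rate explicitly, e.g. `torch.optim.RMSprop(model.parameters(), lr=0.001)` (where `model` is whatever network you built). To get a feel for how much the default alone matters, here is a rough pure-Python sketch of a single textbook RMSprop step on one scalar. The `rmsprop_step` helper, the shared `decay=0.9` and `eps=1e-8` values, and the toy gradient are assumptions for illustration only. Neither library's real implementation is reproduced here, and the two libraries also differ in their default decay.

```python
import math

# Documented default learning rates (the values being compared above).
PYTORCH_DEFAULT_LR = 0.01   # torch.optim.RMSprop
KERAS_DEFAULT_LR = 0.001    # keras.optimizers.RMSprop


def rmsprop_step(param, grad, square_avg, lr, decay=0.9, eps=1e-8):
    """Apply one textbook RMSprop update to a scalar parameter.

    Hypothetical helper for illustration. Returns the updated parameter
    and the updated running average of squared gradients.
    """
    square_avg = decay * square_avg + (1.0 - decay) * grad * grad
    param = param - lr * grad / (math.sqrt(square_avg) + eps)
    return param, square_avg


# Same starting point and gradient; only the learning rate differs.
start, grad = 1.0, 0.5
p_torch, _ = rmsprop_step(start, grad, 0.0, PYTORCH_DEFAULT_LR)
p_keras, _ = rmsprop_step(start, grad, 0.0, KERAS_DEFAULT_LR)

step_torch = start - p_torch
step_keras = start - p_keras
print(step_torch / step_keras)  # ratio of the two step sizes, roughly 10
```

With everything else held equal, each update under the PyTorch default is about ten times larger, which could plausibly explain a gap in results on a small network if lr was never set explicitly.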