PyTorch comparable to but worse than Keras on a simple feed-forward network

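Comparing the Keras and PyTorch documentation for RMSprop, it seems that PyTorch's default lr (0.01) is 10x as large as Keras's (0.001). Unless you set lr explicitly in both, the two runs are training with different learning rates. Do you have some other code that changes lr somewhere?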

keras: https://keras.io/optimizers/#rmsprop
pytorch: http://pytorch.org/docs/0.2.0/optim.html#torch.optim.RMSprop
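
To rule out the default-lr difference, you could pass lr explicitly on the PyTorch side. This is only a minimal sketch: the `torch.nn.Linear(10, 1)` model is a hypothetical stand-in for your network, and 1e-3 is the Keras default taken from the docs linked above.

```python
import torch

# Hypothetical stand-in for the actual feed-forward network.
model = torch.nn.Linear(10, 1)

# Relying on the PyTorch default gives lr=1e-2.
default_opt = torch.optim.RMSprop(model.parameters())

# Passing lr explicitly to match the Keras default of 1e-3.
matched_opt = torch.optim.RMSprop(model.parameters(), lr=1e-3)

# The learning rate actually in use can be read back from param_groups.
default_lr = default_opt.param_groups[0]["lr"]
matched_lr = matched_opt.param_groups[0]["lr"]
print("default lr:", default_lr)
print("matched lr:", matched_lr)
```

Printing `param_groups[0]["lr"]` right before training is also a quick way to check whether some other code is changing lr.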