RuntimeError: Function 'MulBackward0' returned nan values in its 0th output

You should check the optimizers' docs in detail, as there might be differences! Momentum in particular, IIRC.
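To make the momentum point concrete, here is a minimal sketch in plain Python (no torch needed). It assumes the comparison is between the update rule described in the `torch.optim.SGD` docs and the Sutskever-style rule that some other frameworks use; the thread does not say which optimizer or framework is involved, so treat that as a guess. The function names and the toy gradients and learning rates are made up for illustration.

```python
def pytorch_style(grads, lrs, momentum=0.9, p=0.0):
    # Form described in the torch.optim.SGD docs:
    #   v = momentum * v + g ;  p = p - lr * v
    v = 0.0
    for g, lr in zip(grads, lrs):
        v = momentum * v + g
        p = p - lr * v
    return p


def sutskever_style(grads, lrs, momentum=0.9, p=0.0):
    # Form from Sutskever et al., used by some other frameworks:
    #   v = momentum * v + lr * g ;  p = p - v
    v = 0.0
    for g, lr in zip(grads, lrs):
        v = momentum * v + lr * g
        p = p - v
    return p


# Hypothetical toy inputs: constant gradient, with and without an lr drop.
grads = [1.0, 1.0, 1.0, 1.0]
constant_lr = [0.1, 0.1, 0.1, 0.1]
decayed_lr = [0.1, 0.1, 0.01, 0.01]

# Gap between the two rules' final parameter value in each setting.
gap_constant = abs(pytorch_style(grads, constant_lr) - sutskever_style(grads, constant_lr))
gap_decayed = abs(pytorch_style(grads, decayed_lr) - sutskever_style(grads, decayed_lr))

print("gap with constant lr:", gap_constant)
print("gap with decayed lr: ", gap_decayed)
```

With a constant learning rate the two forms land on the same parameter value (up to float rounding), but as soon as the learning rate changes mid-run they diverge, because one form bakes the old learning rate into the velocity buffer and the other does not. So if you ported hyperparameters together with an lr schedule, that alone can change the training dynamics.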