I have been reading about the Adam optimizer, and several sources describe Adam as a combination of the Momentum and RMSprop optimizers.
So if we:
- Set β1 = 0, does that mean Adam behaves exactly like the RMSprop optimizer?
- Set β2 = 0, does that mean Adam behaves exactly like the Momentum optimizer?
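
One way to probe both questions is to compare the update sequences numerically on the same gradients. The sketch below is only an illustration under stated assumptions: it uses scalar gradients, the standard Adam update with bias correction, an RMSprop variant without bias correction, and the exponential-moving-average form of momentum. The function names (`adam_steps`, `rmsprop_steps`, `momentum_steps`) and the sample gradients are hypothetical, and library implementations may differ in details such as where `eps` is added.

```python
import numpy as np

def adam_steps(grads, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """Per-step parameter updates of Adam (with bias correction)."""
    m, v, updates = 0.0, 0.0, []
    for t, g in enumerate(grads, start=1):
        m = beta1 * m + (1 - beta1) * g          # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g      # second-moment estimate
        m_hat = m / (1 - beta1 ** t)             # bias-corrected moments
        v_hat = v / (1 - beta2 ** t)
        updates.append(-lr * m_hat / (np.sqrt(v_hat) + eps))
    return np.array(updates)

def rmsprop_steps(grads, lr=0.01, beta2=0.999, eps=1e-8):
    """Per-step updates of a plain RMSprop (assumed: no bias correction)."""
    v, updates = 0.0, []
    for g in grads:
        v = beta2 * v + (1 - beta2) * g * g
        updates.append(-lr * g / (np.sqrt(v) + eps))
    return np.array(updates)

def momentum_steps(grads, lr=0.01, beta1=0.9):
    """Per-step updates of SGD with momentum (assumed: EMA form)."""
    m, updates = 0.0, []
    for g in grads:
        m = beta1 * m + (1 - beta1) * g
        updates.append(-lr * m)
    return np.array(updates)

# Arbitrary sample gradients, chosen only for illustration.
grads = [1.0, -0.5, 2.0, 0.1, -3.0]

adam_b1_zero = adam_steps(grads, beta1=0.0)   # question 1
adam_b2_zero = adam_steps(grads, beta2=0.0)   # question 2

print("beta1=0 matches RMSprop: ", np.allclose(adam_b1_zero, rmsprop_steps(grads)))
print("beta2=0 matches Momentum:", np.allclose(adam_b2_zero, momentum_steps(grads)))
```

Under these assumptions, two things stand out in the formulas. With β1 = 0, Adam still applies the bias correction to the second moment, which this RMSprop variant lacks. With β2 = 0, the second-moment estimate collapses to the current squared gradient, so Adam still divides by roughly |g|, which plain momentum never does.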