In the official EfficientNet paper, I read:
"We train our EfficientNet models on ImageNet using similar settings as (Tan et al., 2019): RMSProp optimizer with decay 0.9 and momentum 0.9; batch norm momentum 0.99; weight decay 1e-5; initial learning rate 0.256 that decays by 0.97 every 2.4 epochs."
Leaving aside the learning rate scheduler, how can I create the same optimizer in PyTorch? Does alpha correspond to the paper's decay? The parameter names of PyTorch's RMSprop confuse me.
Is this correct?
torch.optim.RMSprop( self.parameters(), lr=self.learning_rate, alpha=0.9, momentum=0.9, weight_decay=1e-5 )
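For context, here is a minimal sketch of how the paper's settings could map onto PyTorch (not the authors' code; the placeholder model and the per-epoch approximation of the "every 2.4 epochs" decay are my assumptions). In PyTorch, alpha is the smoothing constant of the squared-gradient moving average, which is what TensorFlow's RMSProp calls decay, so alpha=0.9 does match the paper:

```python
import torch

model = torch.nn.Linear(10, 2)  # placeholder model, stands in for self

optimizer = torch.optim.RMSprop(
    model.parameters(),
    lr=0.256,           # initial learning rate from the paper
    alpha=0.9,          # the paper's "decay 0.9"
    momentum=0.9,       # the paper's "momentum 0.9"
    weight_decay=1e-5,  # L2 penalty; PyTorch adds it to the gradient
)

# "decays by 0.97 every 2.4 epochs": with a scheduler stepped once per
# epoch, this is roughly an exponential decay with
# gamma = 0.97 ** (1 / 2.4) applied every epoch.
scheduler = torch.optim.lr_scheduler.ExponentialLR(
    optimizer, gamma=0.97 ** (1 / 2.4)
)
```

One caveat worth checking if you are trying to reproduce results: PyTorch's RMSprop defaults to eps=1e-8, while the reference TensorFlow implementation reportedly uses a much larger epsilon (on the order of 1e-3), which can noticeably change training behavior.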