In the official EfficientNet paper, I read:
"We train our EfficientNet models on ImageNet using similar settings as (Tan et al., 2019): RMSProp optimizer with decay 0.9 and momentum 0.9; batch norm momentum 0.99; weight decay 1e-5; initial learning rate 0.256 that decays by 0.97 every 2.4 epochs."
Leaving aside the learning rate scheduler, how can I create the same optimizer in PyTorch? Does alpha correspond to the paper's decay? The parameter names of PyTorch's RMSprop confuse me.
Is this correct?
torch.optim.RMSprop( self.parameters(), lr=self.learning_rate, alpha=0.9, momentum=0.9, weight_decay=1e-5 )
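For context, here is a minimal sketch of how the paper's settings could map onto PyTorch (not the authors' code; the placeholder model and the per-epoch approximation of the "every 2.4 epochs" decay are my assumptions). In PyTorch, alpha is the smoothing constant of the squared-gradient moving average, which is what TensorFlow's RMSProp calls decay, so alpha=0.9 does match the paper:

```python
import torch

model = torch.nn.Linear(10, 2)  # placeholder model, stands in for self

optimizer = torch.optim.RMSprop(
    model.parameters(),
    lr=0.256,           # initial learning rate from the paper
    alpha=0.9,          # the paper's "decay 0.9"
    momentum=0.9,       # the paper's "momentum 0.9"
    weight_decay=1e-5,  # L2 penalty; PyTorch adds it to the gradient
)

# "decays by 0.97 every 2.4 epochs": with a scheduler stepped once per
# epoch, this is roughly an exponential decay with
# gamma = 0.97 ** (1 / 2.4) applied every epoch.
scheduler = torch.optim.lr_scheduler.ExponentialLR(
    optimizer, gamma=0.97 ** (1 / 2.4)
)
```

One caveat worth checking if you are trying to reproduce results: PyTorch's RMSprop defaults to eps=1e-8, while the reference TensorFlow implementation reportedly uses a much larger epsilon (on the order of 1e-3), which can noticeably change training behavior.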