Setting learning rate for Stochastic Weight Averaging

Hi there,

I’m experimenting with stochastic weight averaging.

What I have done so far:

  • Initialized the network with my current best model
  • Reduced the learning rate I used for my current best model by one order of magnitude
  • Trained for 20 epochs using EMA weighting with the following setup:
    AveragedModel(self.model, multi_avg_fn=get_ema_multi_avg_fn(0.999))

However, this strategy did not improve the model.

Now I’m wondering if the strategy I chose makes sense. My questions are:

  • Should I use a higher learning rate instead?
  • Should I average over more epochs?
  • Should I use equal-weight averaging instead of EMA? Or just a different alpha?
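
For reference, the equal-weight alternative from the last question would look roughly like this (a sketch only: `AveragedModel` without `multi_avg_fn` keeps a running equal-weight mean, `SWALR` holds a constant averaging learning rate, and the `swa_lr` value and toy model/data are placeholders):

```python
import torch
from torch import nn
from torch.optim.swa_utils import AveragedModel, SWALR, update_bn

model = nn.Linear(4, 1)  # placeholder for the real network
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
swa_model = AveragedModel(model)  # no multi_avg_fn -> equal-weight mean
swa_scheduler = SWALR(optimizer, swa_lr=5e-3)  # constant SWA lr (placeholder value)

loader = [(torch.randn(8, 4), torch.randn(8, 1)) for _ in range(5)]  # toy data
for epoch in range(20):
    for x, y in loader:
        loss = nn.functional.mse_loss(model(x), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    swa_model.update_parameters(model)  # one averaging step per epoch
    swa_scheduler.step()

# recompute BatchNorm statistics for the averaged weights (no-op here,
# since the toy model has no BatchNorm layers)
update_bn(loader, swa_model)
```
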

I would be very grateful if someone could share their experience :slight_smile:

Thanks a lot!
Thorsten