How to average weights from different saved epochs?

I’m training a model and want to load the last three saved checkpoints, average their weights, and save the result as a single new model. All checkpoints come from the same architecture and the same training data. Any ideas?

I can do this easily in Keras/TF, where I have more experience, but I need help from someone more experienced with PyTorch.

A simple way to go about this would be to load each checkpoint in succession, accumulate its parameter values into appropriately sized tensors, and then divide by 3 to get the mean.

A simple one-layer example would be:

layer_1 = None
for name, param in model.named_parameters():
    if name == 'fc.weight':
        # Accumulator with the same shape, dtype, and device as the layer's weights
        layer_1 = torch.zeros_like(param.data)

Then for each checkpoint do:

for name, param in model.named_parameters():
    if name == 'fc.weight':
        layer_1 += param.data

Now divide layer_1 by three, create a new model instance, and run:

for name, param in model.named_parameters():
    if name == 'fc.weight':
        # layer_1 already holds the mean at this point
        param.data.copy_(layer_1)

There is probably a much more elegant way to do this, but this is what comes to mind for me.
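For a more general sketch, you can average every parameter at once by working directly on the state_dicts instead of looping over individual layers. The checkpoint paths and MyModel below are hypothetical placeholders, and this assumes the checkpoints were saved with torch.save(model.state_dict(), path):

import torch

# Hypothetical checkpoint paths; adjust to your own files
paths = ['epoch_8.pth', 'epoch_9.pth', 'epoch_10.pth']
state_dicts = [torch.load(p, map_location='cpu') for p in paths]

avg_state = {}
for key in state_dicts[0]:
    if state_dicts[0][key].is_floating_point():
        # Stack the three copies of this tensor and take the elementwise mean
        avg_state[key] = torch.stack([sd[key] for sd in state_dicts]).mean(dim=0)
    else:
        # Integer buffers (e.g. num_batches_tracked) can't be meaningfully averaged
        avg_state[key] = state_dicts[0][key]

model = MyModel()  # same architecture the checkpoints were trained with
model.load_state_dict(avg_state)
torch.save(model.state_dict(), 'averaged.pth')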

I found an already implemented technique for this: ‘Stochastic Weight Averaging’ (SWA), and it works!
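For anyone who finds this later: since PyTorch 1.6, SWA lives in torch.optim.swa_utils. If I read the API correctly, AveragedModel keeps a running equal mean of the parameters you feed it, so a rough sketch of averaging saved checkpoints (paths and MyModel again hypothetical) would be:

import torch
from torch.optim.swa_utils import AveragedModel

model = MyModel()  # same architecture as the checkpoints
swa_model = AveragedModel(model)

# Each update_parameters() call folds the current weights into a running mean
for path in ['epoch_8.pth', 'epoch_9.pth', 'epoch_10.pth']:
    model.load_state_dict(torch.load(path, map_location='cpu'))
    swa_model.update_parameters(model)

torch.save(swa_model.module.state_dict(), 'swa_averaged.pth')

Note that by default AveragedModel averages only parameters, not buffers, so if the network has BatchNorm layers you would still want to run torch.optim.swa_utils.update_bn over your training loader to recompute the running statistics.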