I'm interested in SWA (stochastic weight averaging) and would like to try it, but I don't know which implementation to use. I don't understand the difference between the SWA described in the blog post and the one in the docs linked below.
Which one is the latest version of SWA, torch.optim.swa_utils or torchcontrib.optim.SWA, and what is the difference between them?
blog: https://pytorch.org/blog/stochastic-weight-averaging-in-pytorch/
docs: https://pytorch.org/docs/stable/optim.html