Why don't we add AdamW to the official optimizer sets?

Hi all,

I usually use the AdamW optimizer implemented by egg-west, since it is clearly effective when I train models. So I wonder why PyTorch doesn’t include AdamW or SGDR in its official set of optimizers. Is there a specific reason, such as unresolved issues with AdamW or SGDR in theory or in their implementation?
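
For context, the AdamW idea I mean is decoupling weight decay from the gradient-based update, as in the Loshchilov & Hutter paper. Here is a minimal sketch of that idea on top of the built-in Adam (not the egg-west implementation; the hyperparameter values are just illustrative):

```python
import torch

# Minimal sketch of decoupled weight decay ("AdamW") using plain Adam
# with weight_decay=0, so no L2 term enters the gradient or the moments.
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=0.0)
wd = 1e-2  # decoupled weight decay coefficient (illustrative value)

x, y = torch.randn(32, 10), torch.randn(32, 1)
loss = torch.nn.functional.mse_loss(model(x), y)

optimizer.zero_grad()
loss.backward()
# Decay the weights directly, outside the Adam moment estimates.
with torch.no_grad():
    for group in optimizer.param_groups:
        for p in group["params"]:
            p.mul_(1 - group["lr"] * wd)
optimizer.step()
```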

Thanks,
Jinserk


https://www.fast.ai/2018/07/02/adam-weight-decay/

This article says that fast.ai is the only library that has implemented the fix.

I found a PR for AdamW.

(Edit)

I think this PR is better than the above one.

torch.optim.lr_scheduler.CosineAnnealingWarmRestarts is a scheduler for SGDR.
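
A quick usage sketch with SGD, in case it helps (the T_0/T_mult values are just illustrative):

```python
import torch
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts

model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
# First restart after 10 epochs, then the period doubles each time (10, 20, 40, ...).
scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=10, T_mult=2)

x, y = torch.randn(32, 10), torch.randn(32, 1)
for epoch in range(30):
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()  # cosine-anneal the LR, resetting it at each warm restart
```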

Thanks @Tony-Y for the information! I didn’t know the PR existed, and I’m surprised that it has been open for more than a year waiting to be verified. I knew about the fast.ai solution but wanted to ask why official PyTorch doesn’t have one, since AdamW is quite effective. I’ll follow the PRs as well. Thank you again!