I was going through the optimizers available in PyTorch and noticed that `torch.optim.SparseAdam` doesn't accept a `weight_decay` argument. Is there a vital piece of theory I'm missing here, or is it possible to implement weight decay alongside it?
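For context, here is a minimal sketch of what I mean by implementing it manually: applying a decoupled (AdamW-style) decay step after `SparseAdam.step()`, touching only the embedding rows that actually received gradients. The model, the `weight_decay` value, and the row-selection logic are my own illustration, not part of the `SparseAdam` API.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# sparse=True makes emb.weight.grad a sparse tensor, which SparseAdam expects
emb = nn.Embedding(10, 4, sparse=True)
opt = torch.optim.SparseAdam(emb.parameters(), lr=1e-2)
weight_decay = 1e-3  # illustrative value, not a SparseAdam argument

idx = torch.tensor([1, 3, 3])
loss = emb(idx).sum()
opt.zero_grad()
loss.backward()   # produces a sparse gradient on emb.weight
opt.step()

# Manual decoupled decay: shrink only the rows that had gradients,
# mirroring how SparseAdam itself updates only those rows.
with torch.no_grad():
    rows = emb.weight.grad.coalesce().indices()[0].unique()
    emb.weight[rows] *= (1.0 - opt.defaults["lr"] * weight_decay)
```

Would something like this be theoretically sound, or does weight decay on only the touched rows break the usual regularization interpretation?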