SparseAdam optimizer with weight decay

I was going through the optimizers available in PyTorch and noticed that the SparseAdam optimizer doesn't support weight decay. Is there a vital piece of theory I'm missing here, or is it possible to implement this?
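For context on what I've tried so far: one workaround I've been experimenting with is applying weight decay manually after each optimizer step, and only to the embedding rows that actually received a gradient (decaying every row on every step would defeat the point of sparse updates, and may be part of the theoretical reason it's omitted). A minimal sketch of the idea, using a decoupled (AdamW-style) decay and a hypothetical `weight_decay` value:

```python
import torch

# Toy embedding with sparse gradients, the typical use case for SparseAdam
emb = torch.nn.Embedding(10, 4, sparse=True)
opt = torch.optim.SparseAdam(emb.parameters(), lr=1e-2)
weight_decay = 1e-3  # hypothetical value, not a SparseAdam argument

idx = torch.tensor([1, 3, 3])  # only rows 1 and 3 are touched
opt.zero_grad()
loss = emb(idx).sum()
loss.backward()
opt.step()

# Decoupled weight decay applied only to the rows that got a gradient,
# so untouched rows keep the sparse-update benefit
with torch.no_grad():
    rows = emb.weight.grad.coalesce().indices()[0].unique()
    emb.weight[rows] *= (1 - weight_decay)
```

I'm not sure this is equivalent to the coupled (L2-in-the-gradient) decay that dense Adam implements, which is why I'm asking whether there's a deeper reason it was left out.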