I am reading the Adam optimizer source code and came across the following lines:
if group['weight_decay'] != 0:
    grad = grad.add(group['weight_decay'], p.data)
It is clear that weight_decay and p.data are multiplied and the product is then added to grad. However, when I check the documentation, I cannot find a version of the add method that takes a scalar followed by a tensor.
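To make my reading of the arithmetic concrete, here is a minimal pure-Python sketch of what I believe the call computes elementwise, namely grad + weight_decay * p.data. Plain lists stand in for tensors, the numbers are made up, and `add_scaled` is a hypothetical helper I wrote for illustration, not a PyTorch function.

```python
def add_scaled(grad, scale, param):
    """Return grad + scale * param, computed element by element.

    Hypothetical stand-in for grad.add(scale, param); lists replace tensors.
    """
    if len(grad) != len(param):
        raise ValueError("grad and param must have the same length")
    # Build a new list, mirroring the out-of-place add (grad itself is not modified).
    return [g + scale * p for g, p in zip(grad, param)]


# Made-up values, chosen so the floating-point results are exact.
grad = [0.5, -1.0, 2.0]
param = [1.0, 2.0, -4.0]
weight_decay = 0.5

decayed = add_scaled(grad, weight_decay, param)
print(decayed)
```

If I read it right, this is the same operation as the documented scaled form `torch.add(input, other, alpha=value)`, which computes input + alpha * other. What I have not found documented is the overload where the scalar comes first as a positional argument.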