Tensor.add confusing in optimizer code

I was looking into the Adam optimizer code and ran into the following:

if group['weight_decay'] != 0:
    grad = grad.add(group['weight_decay'], p.data)

It is clear that weight_decay and p.data are multiplied and the result is added to grad. However, checking the documentation, I cannot find an add method with this signature.

2 Likes

It’s the second entry:
https://pytorch.org/docs/stable/torch.html?highlight=add#torch.add
torch.add(input, value=1, other, out=None)

But we should really make a note of it in the tensor.add docs.
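To illustrate, here is a small sketch of what that overload computes. Note that the positional-scalar form `grad.add(value, other)` has since been deprecated; in current PyTorch the same thing is spelled with the `alpha` keyword, so the example below uses that form (the tensor values are made up for demonstration):

```python
import torch

# Hypothetical gradient and parameter tensors, standing in for
# grad and p.data in the optimizer code above.
grad = torch.tensor([1.0, 2.0])
p_data = torch.tensor([10.0, 20.0])
weight_decay = 0.1

# torch.add(input, value, other) computed input + value * other.
# The modern spelling of the same operation uses the alpha keyword:
decayed = grad.add(p_data, alpha=weight_decay)

# This matches the explicit computation grad + weight_decay * p.data.
expected = grad + weight_decay * p_data
print(decayed)  # tensor([2., 4.])
```

So the line in the Adam code is just applying L2 weight decay by folding `weight_decay * p.data` into the gradient in a single fused call.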

3 Likes

Ah, I did not notice that. Thanks