Hi,
I’m reading the PyTorch source code, and sgd.py has this line:
p.data.add_(-group['lr'], d_p)
I think this code means p = p - lr * d_p, right? So my question is: how does the add_() call above achieve lr * d_p, i.e. the multiplication?
Thanks
Please check the documentation for add.
torch.add(input, value=1, other, out=None) computes input + value * other, and its in-place version is input.add_(value=1, other).
So p.data.add_(-group['lr'], d_p) is p.data.add_(value=-group['lr'], other=d_p), i.e. p = p + (-lr) * d_p = p - lr * d_p.
I think this kind of API follows a design that is common to most vectorized operations.
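To see the scaled add in action, here is a minimal sketch with made-up values for lr, p and d_p (they are not taken from sgd.py). It assumes a recent PyTorch release, where the scale factor is passed as the keyword argument alpha after the tensor, rather than as the first positional argument as in the line quoted from sgd.py:

```python
import torch

# Illustrative values only; these names mirror the thread, not sgd.py itself.
lr = 0.1
p = torch.tensor([1.0, 2.0, 3.0])
d_p = torch.tensor([0.5, 0.5, 0.5])

# Out-of-place reference result: p - lr * d_p
expected = p - lr * d_p

# In-place scaled add: p <- p + alpha * d_p, with alpha = -lr.
# The multiplication by the scalar happens inside add_ itself.
p.add_(d_p, alpha=-lr)

print(torch.allclose(p, expected))
```

Doing the scaling inside add_ avoids building a temporary tensor for lr * d_p, which is probably why the optimizer code is written this way.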