woofie56 (Woofie56)
In the documentation at http://pytorch.org/docs/torch.html#torch.addmm we have:
torch.addmm(beta=1, mat, alpha=1, mat1, mat2, out=None)
out = (beta * M) + (alpha * mat1 @ mat2)
In torch.nn.Linear we have:
output.addmm_(0, 1, input, weight.t())
So is it the case that, in output.addmm_(0, 1, input, weight.t()):
- 0 is beta
- 1 is M (or mat)
- so out = alpha * mat1 @ mat2?
Thanks
Notice that addmm_ is the in-place version of addmm, and it’s a bound method.
output.addmm_(0, 1, input, weight.t()) is actually translated to torch.Tensor.addmm_(0, output, 1, input, weight.t()).
Because it is an in-place method, the result is written to the same tensor, output, that the method was called on.
So 0 is beta, 1 is alpha, and output itself plays the role of M. What happens there is that output becomes 0 * output + 1 * input @ weight.t()
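A quick way to check this yourself is the sketch below. It assumes a recent PyTorch release, where beta and alpha are passed as keyword arguments rather than in the older positional form quoted above. The tensor names, shapes and fill values (x, weight, output, 7.0) are made up for illustration.

```python
import torch

torch.manual_seed(0)

# Illustrative shapes (assumptions): batch of 4, 3 input features, 5 output features.
x = torch.randn(4, 3)
weight = torch.randn(5, 3)

# Pre-fill output with junk to show that beta=0 discards its old contents.
output = torch.full((4, 5), 7.0)

# In-place: output = beta * output + alpha * (x @ weight.t())
output.addmm_(x, weight.t(), beta=0, alpha=1)

expected = x @ weight.t()
print(torch.allclose(output, expected, atol=1e-6))

# Out-of-place version with beta=1: the first argument is added to the product.
m = torch.ones(4, 5)
out2 = torch.addmm(m, x, weight.t(), beta=1, alpha=1)
print(torch.allclose(out2, m + expected, atol=1e-6))
```

With beta=0 the previous contents of output do not contribute, so the call reduces to a plain matrix product written into output.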