woofie56
(Woofie56)
#1
In http://pytorch.org/docs/torch.html#torch.addmm we have:

```
torch.addmm(beta=1, mat, alpha=1, mat1, mat2, out=None)
out = (beta * M) + (alpha*mat1 @ mat2)
```

In `torch.nn.Linear` we have:

`output.addmm_(0, 1, input, weight.t())`

So is it the case that:

- 0 in `output.addmm_(0, 1, input, weight.t())` is `beta`
- 1 in `output.addmm_(0, 1, input, weight.t())` is `M` or `mat`
- So `out = alpha*mat1 @ mat2`?

Thanks

Notice that `addmm_` is the *in-place* version of `addmm`, and it’s a *bound method*.

`output.addmm_(0, 1, input, weight.t())` is effectively `torch.addmm(0, output, 1, input, weight.t(), out=output)`: the tensor the method is called on fills the `mat` slot of the signature quoted in the question.

Because it is an *in-place* method, the result is written back to the same tensor `output` the method was called on.

Therefore `output` becomes `0 * output + 1 * input @ weight.t()`. In other words, 0 is `beta`, 1 is `alpha` (not `M`), and `M` is `output` itself.
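To make the argument mapping concrete, here is a minimal plain-Python sketch of what the in-place call computes. It is only a model of the semantics using nested lists, not PyTorch's implementation; `addmm_inplace_model` and `transpose` are hypothetical helper names made up for this illustration, and the argument order follows the old positional form `(beta, alpha, mat1, mat2)` from the question.

```python
def addmm_inplace_model(out, beta, alpha, mat1, mat2):
    """Hypothetical model of out.addmm_(beta, alpha, mat1, mat2).

    `out` is a list of rows and is overwritten in place, so it plays
    the role of `mat` (M) in  beta * M + alpha * (mat1 @ mat2).
    """
    n, k, m = len(mat1), len(mat2), len(mat2[0])
    for i in range(n):
        for j in range(m):
            # One entry of the matrix product mat1 @ mat2.
            prod = sum(mat1[i][p] * mat2[p][j] for p in range(k))
            out[i][j] = beta * out[i][j] + alpha * prod
    return out  # the same object that was passed in


def transpose(matrix):
    """Hypothetical stand-in for weight.t()."""
    return [list(col) for col in zip(*matrix)]


x = [[1.0, 2.0]]                                # 1 sample, 2 features
weight = [[3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]   # 3 outputs x 2 inputs
output = [[100.0, 100.0, 100.0]]                # stale values

# beta = 0 wipes the old contents of `output`; alpha = 1 keeps the product.
result = addmm_inplace_model(output, 0, 1, x, transpose(weight))
print(result is output, output)

# With beta = 1 and alpha = 2 the old contents are kept and added to.
acc = [[1.0, 1.0, 1.0]]
addmm_inplace_model(acc, 1, 2, x, transpose(weight))
print(acc)
```

As far as I know, recent PyTorch releases take the scalars as keyword-only arguments, so the same call would be spelled `output.addmm_(input, weight.t(), beta=0, alpha=1)`.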

