The following is the definition of torch.addmm's parameters: torch.addmm(beta=1, mat, alpha=1, mat1, mat2, out=None) → Tensor
from torch — PyTorch 2.1 documentation
But under it there is an example:
M = torch.randn(2, 3)
mat1 = torch.randn(2, 3)
mat2 = torch.randn(3, 3)
torch.addmm(M, mat1, mat2)
tensor([[-4.8716, 1.4671, -1.3746],
[ 0.7573, -3.9555, -2.8681]])
The call to torch.addmm in the example only uses 3 arguments, even though addmm's last two parameters don't have a default value.
This is the source of my confusion.
What M does is get added to the result while mat1 and mat2 are multiplied together; that is, addmm multiplies mat1 by mat2 and adds M to the product, so M plays the role of mat in addmm's signature.
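The behaviour described above can be sketched in plain Python (a minimal sketch of the formula addmm computes, using nested lists instead of tensors; addmm_sketch is a made-up name, not part of PyTorch):

```python
# Sketch of the formula: out = beta * mat + alpha * (mat1 @ mat2)
def addmm_sketch(mat, mat1, mat2, beta=1, alpha=1):
    n, k = len(mat1), len(mat2[0])
    # matrix product mat1 @ mat2
    prod = [[sum(mat1[i][t] * mat2[t][j] for t in range(len(mat2)))
             for j in range(k)] for i in range(n)]
    # scale the product by alpha and add the bias matrix scaled by beta
    return [[beta * mat[i][j] + alpha * prod[i][j] for j in range(k)]
            for i in range(n)]

M    = [[1, 1], [1, 1]]
mat1 = [[1, 2], [3, 4]]
mat2 = [[5, 6], [7, 8]]
print(addmm_sketch(M, mat1, mat2))  # [[20, 23], [44, 51]]
```

With the default beta=1 and alpha=1 this is simply M + mat1 @ mat2, matching the example output above (up to the random values).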
But what I see in the PyTorch doc is "torch.addmm(beta=1, mat, alpha=1, mat1, mat2, out=None)", in which M, mat1 and mat2 would have to be parsed as beta, mat and alpha. I ran the example on my PC and it works just as you say, but I don't understand why it works. Please clear up my confusion. Thanks!
I know what torch.addmm does, but I don't understand how the call can work like this.
Let me give an example: why do torch.addmm(1, M, 1, mat1, mat2) and torch.addmm(M, mat1, mat2) work the same? Why is the M in the second call parsed as mat instead of beta, even though it is the first positional argument? Please forgive my English…
To my knowledge, a non-default parameter can't follow a default parameter in Python. I'm also confused by the definition of addmm in the PyTorch doc, where mat follows beta=1. And this question is closely related to what we are talking about.
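The Python rule mentioned above can be demonstrated directly (plain Python, nothing PyTorch-specific):

```python
# A non-default parameter cannot follow a defaulted one in an ordinary
# parameter list -- Python rejects it with a SyntaxError at compile time:
try:
    compile("def f(beta=1, mat): pass", "<demo>", "exec")
    ok = True
except SyntaxError:
    ok = False
print(ok)  # False

# After a bare *, however, parameters are keyword-only, and there a
# non-default parameter MAY follow a defaulted one:
def h(*, beta=1, mat):
    return (beta, mat)

print(h(mat="M"))          # (1, 'M')
print(h(beta=2, mat="M"))  # (2, 'M')
```

So a signature where mat appears after beta=1 is only expressible in Python for keyword-only parameters; the doc's rendering of addmm's schema doesn't correspond to a legal plain positional parameter list.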
So if you use the function, note that the bias in the function is given a default value of 1, so it lets you pass in the weights and features without worrying about the bias, unless you want to change it.
What happens here is that the addmm does have “overloads” to implement the behaviour that, as you correctly note, would not be possible using a single plain Python function.
The twist is that the one using keyword-only alpha and beta arguments is the preferred one (defined in aten/src/ATen/native/native_functions.yaml), while the others are deprecated (defined in tools/autograd/deprecated.yaml).
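As an illustration of what such overload handling could look like, here is a hypothetical plain-Python dispatcher (not PyTorch's actual implementation, which lives in C++; the name addmm_dispatch is made up) that accepts both calling conventions:

```python
from numbers import Number

# Hypothetical sketch of how one entry point can emulate two "overloads":
#   preferred:  addmm(mat, mat1, mat2, *, beta=1, alpha=1)
#   deprecated: addmm(beta, mat, alpha, mat1, mat2)
def addmm_dispatch(*args, beta=1, alpha=1):
    if len(args) == 5 and isinstance(args[0], Number):
        # deprecated positional form: first argument is a scalar beta
        beta, mat, alpha, mat1, mat2 = args
    elif len(args) == 3:
        # preferred form: beta/alpha only via keywords
        mat, mat1, mat2 = args
    else:
        raise TypeError("unexpected arguments")
    return {"beta": beta, "alpha": alpha, "mat": mat}

# Both spellings resolve to the same parameter binding:
print(addmm_dispatch("M", "A", "B"))        # {'beta': 1, 'alpha': 1, 'mat': 'M'}
print(addmm_dispatch(1, "M", 1, "A", "B"))  # {'beta': 1, 'alpha': 1, 'mat': 'M'}
```

The key idea is that the dispatcher inspects the number and types of the positional arguments to decide which overload was meant, which is why addmm(M, mat1, mat2) binds M to mat rather than beta.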
It is, perhaps, unfortunate that the addmm example uses parameters that are marked as deprecated.
Best regards
Thomas
(who looked at this way too much when generating type hints)
Though I can't fully understand the content in the link, it's enough for me to know that the actual definition isn't what's shown in the PyTorch doc. Thank you very much!