Hello. This may be a bit of an elementary question, but I was having trouble figuring out the nuts and bolts of things.

I’m currently trying to implement a neural network model, and in the original paper there is something about performing matrix multiplication with a layer-specific weight matrix.

Until now, when performing that operation I have used torch.nn.Linear, but I noticed that the reference implementation used torch.nn.Parameter instead and performed the matrix multiplication explicitly.

What is the difference between these two approaches? I initially thought they were the same thing, but something’s telling me that’s not the case. Thanks.

By default, nn.Linear has a bias, which means it also performs an addition after the matrix multiplication (y = xWᵀ + b). You can turn the bias off by passing bias=False; this makes nn.Linear equivalent to a plain matrix multiplication with its weight (y = xWᵀ). Also worth noting: nn.Parameter is not a module at all. It is just a tensor subclass that marks a tensor as learnable so it shows up in model.parameters(); nn.Linear is a Module that owns such a parameter (its weight) and applies the matmul for you. With a bare nn.Parameter, you write the matmul yourself.
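A minimal sketch of that equivalence (the variable names are mine; I copy the layer’s weight into a bare nn.Parameter so both paths use identical values):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(4, 3)  # a batch of 4 samples, 3 features each

# Path 1: nn.Linear without bias, i.e. y = x @ W.T
linear = nn.Linear(3, 5, bias=False)

# Path 2: a bare nn.Parameter holding the same weight,
# with the matrix multiplication written out explicitly.
W = nn.Parameter(linear.weight.detach().clone())

y_linear = linear(x)
y_manual = x @ W.t()

print(torch.allclose(y_linear, y_manual))  # True
```

Note that nn.Linear stores its weight with shape (out_features, in_features), so the explicit version multiplies by the transpose.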

Hi, thanks for the reply. So if I set the bias argument to False and run data through an nn.Linear layer, that would be equivalent to performing a matrix multiplication of the data with an nn.Parameter weight?

Yes. In fact, if you print the code of a TorchScript graph involving a Linear layer, you’ll see that in TorchScript, nn.Linear is expressed as torch.addmm (bias + input @ weightᵀ), or as a plain matmul when there is no bias.