Hello. This may be a bit of an elementary question, but I was having trouble figuring out the nuts and bolts of things.

I’m currently trying to implement a neural network model, and in the original paper there is something about performing matrix multiplication with a layer-specific weight matrix.

Until now, when performing that operation I have used torch.nn.Linear, but I noticed that the reference implementation used torch.nn.Parameter instead and performed the matrix multiplication explicitly.

What is the difference between these two approaches? I initially thought they were the same thing, but something’s telling me that’s not the case. Thanks.

By default, nn.Linear has a bias, which means it also performs an addition after the matrix multiplication (y = xWᵀ + b). You can turn the bias off by passing bias=False; this makes nn.Linear equivalent to a plain matrix multiplication with its weight (y = xWᵀ). Also worth noting: nn.Parameter is not a module at all. It is just a tensor subclass that marks a tensor as learnable so it shows up in model.parameters(); nn.Linear is a Module that owns such a parameter (its weight) and applies the matmul for you. With a bare nn.Parameter, you write the matmul yourself.
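A minimal sketch of that equivalence (the variable names are mine; I copy the layer’s weight into a bare nn.Parameter so both paths use identical values):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(4, 3)  # a batch of 4 samples, 3 features each

# Path 1: nn.Linear without bias, i.e. y = x @ W.T
linear = nn.Linear(3, 5, bias=False)

# Path 2: a bare nn.Parameter holding the same weight,
# with the matrix multiplication written out explicitly.
W = nn.Parameter(linear.weight.detach().clone())

y_linear = linear(x)
y_manual = x @ W.t()

print(torch.allclose(y_linear, y_manual))  # True
```

Note that nn.Linear stores its weight with shape (out_features, in_features), so the explicit version multiplies by the transpose.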

Hi, thanks for the reply. So if I set the bias argument to False and run data through an nn.Linear layer, that would be equivalent to performing a matrix multiplication of the data with an nn.Parameter weight?

Yes. In fact, if you print the code of a TorchScript graph involving a Linear layer, you’ll see that in TorchScript, nn.Linear is expressed as torch.addmm (bias + input @ weightᵀ), or as a plain matmul when there is no bias.