Hello. This may be a bit of an elementary question, but I was having trouble figuring out the nuts and bolts of things.

I’m currently trying to implement a neural network model, and in the original paper there is something about performing matrix multiplication with a layer-specific weight matrix.

Until now, when performing that operation I have used torch.nn.Linear, but I noticed that the reference implementation used torch.nn.Parameter instead and performed the matrix multiplication explicitly.

What is the difference between these two approaches? I initially thought they were the same thing, but something’s telling me that’s not the case. Thanks.

By default, nn.Linear has a bias, which means it also performs an addition after the matrix multiplication (y = xWᵀ + b). You can turn the bias off by passing bias=False; this makes nn.Linear equivalent to a plain matrix multiplication with its weight (y = xWᵀ). Also worth noting: nn.Parameter is not a module at all. It is just a tensor subclass that marks a tensor as learnable so it shows up in model.parameters(); nn.Linear is a Module that owns such a parameter (its weight) and applies the matmul for you. With a bare nn.Parameter, you write the matmul yourself.
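A minimal sketch of that equivalence (the variable names are mine; I copy the layer’s weight into a bare nn.Parameter so both paths use identical values):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(4, 3)  # a batch of 4 samples, 3 features each

# Path 1: nn.Linear without bias, i.e. y = x @ W.T
linear = nn.Linear(3, 5, bias=False)

# Path 2: a bare nn.Parameter holding the same weight,
# with the matrix multiplication written out explicitly.
W = nn.Parameter(linear.weight.detach().clone())

y_linear = linear(x)
y_manual = x @ W.t()

print(torch.allclose(y_linear, y_manual))  # True
```

Note that nn.Linear stores its weight with shape (out_features, in_features), so the explicit version multiplies by the transpose.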

Hi, thanks for the reply. So if I set the bias argument to False and run data through an nn.Linear layer, that would be equivalent to performing a matrix multiplication of the data with an nn.Parameter weight?

Yes. In fact, if you print the code of a TorchScript graph involving a Linear layer, you’ll see that in TorchScript, nn.Linear is expressed as torch.addmm (bias + input @ weightᵀ), or as a plain matmul when there is no bias.