Why does the Linear module seems to do unnecessary transposing?

L0SG · September 5, 2017, 4:03am

I was also thinking about this, and found this issue:

Efficient forward pass in nn.Linear

opened 07:05AM - 20 Jul 17 UTC

closed 07:26AM - 20 Jul 17 UTC

Hi, Maybe we can improve the operation speed in `nn.Linear`. `nn.Linear` …creates a weight matrix of shape (out_features, in_features) as you can see [here](https://github.com/pytorch/pytorch/blob/master/torch/nn/modules/linear.py#L39). Then, as you can see in [F.linear](https://github.com/pytorch/pytorch/blob/master/torch/nn/functional.py#L550-L558), input matrix is multiplied by the transpose of the weight matrix. I think the transpose operation is unnecessary. What if we create a weight matrix of shape (in_features, out_features)? Then, the input matrix can be multiplied by the weight matrix (we don't need to calculate transpose of the matrix). Does this sound right? If i'm right, can i do pull request? Thanks

From what i understand, transposing in forward pass has no overhead. But backward pass will be less efficient if

input.matmul(weight)