The weight (`W`) matrices for an `nn.Linear` layer are stored as `W.T`. That is, if we have 100 neurons in the input layer and 200 neurons in the next layer, our layer definition would be `nn.Linear(in_features=100, out_features=200)`, and I would expect the weight matrix to be of shape `(100, 200)`, since every neuron in the first layer has 200 connections into the second layer and the weights are "propagated" in that direction. However, the weights are transposed before being stored, for efficiency during backprop.
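For reference, a quick sanity check of the shape convention I'm describing (the forward pass computes `y = x @ W.T + b`, so the stored weight is `(out_features, in_features)`):

```python
import torch
import torch.nn as nn

# nn.Linear stores its weight as (out_features, in_features),
# the transpose of the (in, out) shape one might expect.
layer = nn.Linear(in_features=100, out_features=200)
print(layer.weight.shape)  # torch.Size([200, 100])

# The forward pass computes y = x @ W.T + b:
x = torch.randn(1, 100)
y_manual = x @ layer.weight.T + layer.bias
assert torch.allclose(layer(x), y_manual)
```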
Is this behavior restricted to `nn.Linear` layers, or is it implemented in all `nn` modules? I specifically want to know whether the internal weight matrices are transposed for an `RNN` layer. I can see that `weight_ih` (the input-to-hidden matrix) is stored transposed, but I cannot be sure about `weight_hh`, since it's a square matrix. I need to know because I am updating the weights manually for each connection, and transposed matrices might mean I am updating the wrong connections. Basically, I want to know "which neuron led the neuron in the subsequent layer to fire".
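One way to settle the `weight_hh` orientation, since its shape alone can't tell us: replicate a single RNN step by hand using the documented update `h_t = tanh(x_t @ W_ih.T + b_ih + h_{t-1} @ W_hh.T + b_hh)` and check it matches the module's output. If it does, then row `i` of `weight_hh_l0` holds the incoming weights of hidden unit `i`, i.e. `weight_hh_l0[i, j]` connects previous-step hidden unit `j` to hidden unit `i`. A small sketch (sizes chosen arbitrarily for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
rnn = nn.RNN(input_size=3, hidden_size=5)

# Both matrices follow the same (out, in) convention:
print(rnn.weight_ih_l0.shape)  # torch.Size([5, 3]) -> (hidden_size, input_size)
print(rnn.weight_hh_l0.shape)  # torch.Size([5, 5]) -> (hidden_size, hidden_size)

x = torch.randn(1, 1, 3)   # (seq_len, batch, input_size)
h0 = torch.randn(1, 1, 5)  # (num_layers, batch, hidden_size)
out, _ = rnn(x, h0)

# One step by hand, treating both weights as stored transposed:
h_manual = torch.tanh(
    x[0] @ rnn.weight_ih_l0.T + rnn.bias_ih_l0
    + h0[0] @ rnn.weight_hh_l0.T + rnn.bias_hh_l0
)
assert torch.allclose(out[0], h_manual, atol=1e-6)
```

If the assertion holds, the hidden-to-hidden matrix is stored the same way as `weight_ih`, and a manual update to connection (from `j`, to `i`) should touch `weight_hh_l0[i, j]`.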