Is the weighted sum calculated by transposing the weights associated with a linear layer under the hood in a torch nn model?

In the simple nn module shown below, the weight matrix of fc1, i.e. W1, has shape (128 x 784).

Assuming a mini-batch size of 64, the input X has shape (64, 784).

Does this mean that, under the hood, the weighted sum inside fc1 is computed as the matrix product of the input X (shape: 64 x 784) and the transpose of W1 (784 x 128), producing an fc1 output of shape (64 x 128)?

from torch import nn
import torch.nn.functional as F 

class myNN(nn.Module):
    def __init__(self):
        super().__init__()
        # the first hidden layer
        self.fc1 = nn.Linear(784, 128)
        # the second hidden layer
        self.fc2 = nn.Linear(128, 64)
        # the output layer 
        self.fc3 = nn.Linear(64, 10)
    def forward(self, x):
        # the 1st hidden layer
        x = F.relu(self.fc1(x))
        # the 2nd hidden layer 
        x = F.relu(self.fc2(x))
        # the output layer 
        x = F.softmax(self.fc3(x), dim=1)
        return x

# create the nn object
model = myNN()

print('fc1 weights shape:\n', model.fc1.weight.shape)

The output is:

fc1 weights shape:
 torch.Size([128, 784])

Yes. The forward pass of nn.Linear is defined as y = xW^T + b, so the weight matrix is transposed in the operation. You can read more in the nn.Linear docs.
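You can verify this yourself: a small sketch that compares the layer's output against the explicit x @ W^T + b computation (the layer shapes match your fc1; the tensor names are just for illustration):

```python
import torch
from torch import nn

torch.manual_seed(0)

fc1 = nn.Linear(784, 128)      # weight shape: (128, 784)
x = torch.randn(64, 784)       # mini-batch of 64 inputs

out_layer = fc1(x)                         # what nn.Linear computes
out_manual = x @ fc1.weight.T + fc1.bias   # explicit y = x W^T + b

print(out_layer.shape)                               # torch.Size([64, 128])
print(torch.allclose(out_layer, out_manual, atol=1e-5))  # True
```

The two results agree up to floating-point rounding, confirming the transposed-weight formulation.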
