In the simple nn.Module shown below, the weight matrix of fc1, i.e. W1, has shape (128, 784).
Assuming a mini-batch size of 64, the input X has shape (64, 784).
Does this mean that, under the hood, the weighted sum inside fc1 is computed as the matrix product of the input X (64 x 784) and the transpose of W1 (784 x 128), giving an fc1 output of shape (64 x 128)?
from torch import nn
import torch.nn.functional as F

class myNN(nn.Module):
    def __init__(self):
        super().__init__()
        # the first hidden layer
        self.fc1 = nn.Linear(784, 128)
        # the second hidden layer
        self.fc2 = nn.Linear(128, 64)
        # the output layer
        self.fc3 = nn.Linear(64, 10)

    def forward(self, x):
        # the 1st hidden layer
        x = F.relu(self.fc1(x))
        # the 2nd hidden layer
        x = F.relu(self.fc2(x))
        # the output layer
        x = F.softmax(self.fc3(x), dim=1)
        return x

# create the nn object
model = myNN()
print('fc1 weights shape:\n', model.fc1.weight.shape)
The output is:
fc1 weights shape:
 torch.Size([128, 784])
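To sanity-check my understanding, I compared the layer's output against the manual matrix product on a dummy batch (a minimal sketch, assuming the default bias=True of nn.Linear, so the bias is added after the product):

import torch
from torch import nn

torch.manual_seed(0)

fc1 = nn.Linear(784, 128)   # same layer as fc1 in myNN
X = torch.randn(64, 784)    # dummy mini-batch of 64 flattened inputs

out_layer = fc1(X)                        # what the layer computes
out_manual = X @ fc1.weight.T + fc1.bias  # (64 x 784) @ (784 x 128) + bias

print(out_layer.shape)                                   # torch.Size([64, 128])
print(torch.allclose(out_layer, out_manual, atol=1e-6))  # True

This seems to confirm that nn.Linear stores the weight as (out_features, in_features) and applies y = x W^T + b, which matches the formula given in the nn.Linear docs.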