In the simple nn.Module shown below, the weight matrix of fc1, i.e. W1, has shape (128, 784).
Assuming a mini-batch size of 64, the input X has shape (64, 784).
Does this mean that, under the hood, the weighted sum inside fc1 is computed as the matrix product of the input X (64 x 784) and the transpose of W1 (784 x 128), giving an fc1 output of shape (64 x 128)?
from torch import nn
import torch.nn.functional as F

class myNN(nn.Module):
    def __init__(self):
        super().__init__()
        # the first hidden layer
        self.fc1 = nn.Linear(784, 128)
        # the second hidden layer
        self.fc2 = nn.Linear(128, 64)
        # the output layer
        self.fc3 = nn.Linear(64, 10)

    def forward(self, x):
        # the 1st hidden layer
        x = F.relu(self.fc1(x))
        # the 2nd hidden layer
        x = F.relu(self.fc2(x))
        # the output layer
        x = F.softmax(self.fc3(x), dim=1)
        return x

# create the nn object
model = myNN()
print('fc1 weights shape:\n', model.fc1.weight.shape)
The output is:
fc1 weights shape:
 torch.Size([128, 784])
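To sanity-check my understanding, I compared the layer's output against the manual matrix product on a dummy batch (a minimal sketch, assuming the default bias=True of nn.Linear, so the bias is added after the product):

import torch
from torch import nn

torch.manual_seed(0)

fc1 = nn.Linear(784, 128)   # same layer as fc1 in myNN
X = torch.randn(64, 784)    # dummy mini-batch of 64 flattened inputs

out_layer = fc1(X)                        # what the layer computes
out_manual = X @ fc1.weight.T + fc1.bias  # (64 x 784) @ (784 x 128) + bias

print(out_layer.shape)                                   # torch.Size([64, 128])
print(torch.allclose(out_layer, out_manual, atol=1e-6))  # True

This seems to confirm that nn.Linear stores the weight as (out_features, in_features) and applies y = x W^T + b, which matches the formula given in the nn.Linear docs.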