Problem in PyTorch tutorial [NEURAL NETWORKS]

I’m new to PyTorch.
While following the NEURAL NETWORKS tutorial, I found it hard to understand the operation self.fc1 = nn.Linear(16*6*6, 120).
I think it should be self.fc1 = nn.Linear(16*(5*5+1), 120), i.e. channels_num * (kernel_width * kernel_height + bias). Which one is right?
Or, how do we account for the bias when we do the affine operation?
Thanks!

import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        # 1 input image channel, 6 output channels, 3x3 square convolution
        # kernel
        self.conv1 = nn.Conv2d(1, 6, 3)
        self.conv2 = nn.Conv2d(6, 16, 3)
        # an affine operation: y = Wx + b
        self.fc1 = nn.Linear(16 * 6 * 6, 120)  # 6*6 from image dimension
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # If the size is a square you can only specify a single number
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def num_flat_features(self, x):
        size = x.size()[1:]  # all dimensions except the batch dimension
        num_features = 1
        for s in size:
            num_features *= s
        return num_features


The required input size of the linear layer depends on the size of the input it will be fed. If you have 16 channels and a 6x6 feature map, as the comment suggests, that is the size you need. If you feed smaller images, you will get smaller inputs.
You can always just try these things and the error message will give you the right number (although not the calculation). Or you could simply print x.shape before and after the `x = x.view(…)` line.
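
To make that concrete, here is a rough sketch (my own, assuming the Net class from the post above and the 32x32 input the tutorial expects):

    import torch
    import torch.nn.functional as F

    net = Net()
    x = torch.randn(1, 1, 32, 32)  # dummy batch: one 32x32 grayscale image
    x = F.max_pool2d(F.relu(net.conv1(x)), (2, 2))
    x = F.max_pool2d(F.relu(net.conv2(x)), 2)
    print(x.shape)  # with the 3x3 kernels above this prints torch.Size([1, 16, 6, 6])

The product of the last three numbers is what the first nn.Linear has to accept as in_features.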

Best regards

Thomas

Thank you!
The feature maps are 5x5 per channel before the fully connected layer.
However, the code in the tutorial is nn.Linear(16*6*6, 120).
I think it should be nn.Linear(16*5*5, 120), or nn.Linear(16*(5*5+1), 120) (to account for the bias).
Which one is right?

Bias doesn’t count here. Did you print the shape or get an error message?
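
To see why the bias doesn’t enter the count, a small sketch (my own, not from the tutorial): nn.Linear(in_features, out_features) keeps the bias as a separate parameter of shape (out_features,), so it never adds to the flattened input size.

    import torch.nn as nn

    fc = nn.Linear(16 * 5 * 5, 120)
    print(fc.weight.shape)  # torch.Size([120, 400]) -- (out_features, in_features)
    print(fc.bias.shape)    # torch.Size([120])      -- one bias per output unit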

I think my problem may have been solved.
In the tutorial and on the GitHub page, they both write nn.Linear(16*6*6, 120).
However, the Colab version writes nn.Linear(16*5*5, 120),
and I can run the Colab code with no error raised.
So there seems to be something wrong with the code in the PyTorch tutorial and on GitHub.
Thank you anyway!


I believe huiquing might be right.

In the network overview image, there appears to be a mismatch between the displayed feature dimensions (at each step, up to the fully connected layer) and the actual feature dimensions you get with the code presented in the tutorial (I haven’t checked the Colab).

For example, the C1 feature maps should be 6@30x30 (not 6@28x28, as in the image) for a 32x32 input image and a Conv2d layer with 6 output channels and a 3x3 kernel (with padding=0 and stride (1, 1), the default parameter values). The same holds for the subsequent features, at each step up to the fully connected layer. The code ends up working, but this might be misleading for someone reading the tutorial.
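
For what it’s worth, here is a quick way to check both variants (my own sketch, just using the layer sizes discussed above):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def trace(kernel_size):
        conv1 = nn.Conv2d(1, 6, kernel_size)
        conv2 = nn.Conv2d(6, 16, kernel_size)
        x = torch.randn(1, 1, 32, 32)
        x = F.max_pool2d(F.relu(conv1(x)), 2)  # C1 + pooling
        x = F.max_pool2d(F.relu(conv2(x)), 2)  # C3 + pooling
        return x.shape

    print(trace(3))  # torch.Size([1, 16, 6, 6]) -> nn.Linear(16*6*6, 120)
    print(trace(5))  # torch.Size([1, 16, 5, 5]) -> nn.Linear(16*5*5, 120), matching the picture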

It seems that the tutorial references PyTorch version 1.4, though, so maybe it has already been corrected in 1.5?


I agree. I think the picture is misleading. The first C1 map should be 6 @ 30x30.

BTW, the picture shows 5x5 kernels whereas the code on the tutorial webpage uses 3x3 kernels.