RuntimeError: shape '[-1, 576]' is invalid for input of size 12800

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

train_data = datasets.MNIST('', train=True, transform=transforms.Compose(
    [transforms.ToTensor()]), download=True)

test_data = datasets.MNIST('', train=False, transform=transforms.Compose(
    [transforms.ToTensor()]), download=True)

train_loader = DataLoader(train_data, batch_size=32)
test_loader = DataLoader(test_data, batch_size=32)

class Network(nn.Module):
    def __init__(self):
        super(Network, self).__init__()
        self.conv = nn.Conv2d(1, 6, 3)
        self.conv2 = nn.Conv2d(6, 16, 3)
        self.fc = nn.Linear(16*6*6, 128)
        self.output = nn.Linear(128, 10)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv(x)), (2, 2))
        x = F.max_pool2d(F.relu(self.conv2(x)), (2, 2))
        x = x.view(-1, 16*6*6)
        x = F.relu(self.fc(x))
        return F.softmax(self.output(x), dim=1)

net = Network()
optimizer = optim.Adam(net.parameters(), lr=0.001)
loss_function = nn.CrossEntropyLoss()

for epoch in range(3):
    running_loss = 0.0
    for i, (X, y) in enumerate(train_loader):
        optimizer.zero_grad()
        y_pred = net(X)
        loss = loss_function(y_pred, y)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        if i % 2000 == 1999:
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

After x = F.max_pool2d(F.relu(self.conv2(x)), (2, 2)), the tensor shape is batch_size x 16 x 5 x 5, i.e. 16*5*5 = 400 features per sample. For a batch of 32 that is 32 * 400 = 12800 elements, and x = x.view(-1, 16*6*6) tries to reshape them into rows of 576; since 12800 is not divisible by 576, the error is thrown.

Change self.fc = nn.Linear(16*6*6, 128) to self.fc = nn.Linear(16*5*5, 128),
and x = x.view(-1, 16*6*6) to x = x.view(-1, 16*5*5).
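
As a quick sanity check (a minimal sketch reusing the layer definitions from the question), you can push a dummy MNIST-sized batch through the two conv/pool stages and confirm that the flattened size per sample is 400, not 576:

import torch
import torch.nn as nn
import torch.nn.functional as F

conv = nn.Conv2d(1, 6, 3)
conv2 = nn.Conv2d(6, 16, 3)

x = torch.randn(32, 1, 28, 28)               # dummy batch of MNIST-sized images
x = F.max_pool2d(F.relu(conv(x)), (2, 2))    # -> 32 x 6 x 13 x 13
x = F.max_pool2d(F.relu(conv2(x)), (2, 2))   # -> 32 x 16 x 5 x 5
print(x.shape)                               # torch.Size([32, 16, 5, 5])
print(x.view(-1, 16 * 5 * 5).shape)          # torch.Size([32, 400])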

Thank you for the solution! However, I still want to know: how do you find the output shape of each layer?

You can calculate a convolution (or pooling) layer's spatial dims using the standard formula
floor((W - F + 2P)/S) + 1, where W is the input width/height, F is the filter size, P is the padding and S is the stride. (The floor matters here: the second 2x2 pool turns 11 into 5, not 5.5.)
This helps you choose the correct input size for the following nn.Linear layer.
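
For example, tracing the network above with the formula (assuming the convs use the default P=0 and S=1, and the 2x2 pools use S=2):

conv  (F=3):       (28 - 3 + 0)/1 + 1 = 26        -> 6 x 26 x 26
pool  (F=2, S=2):  (26 - 2 + 0)/2 + 1 = 13        -> 6 x 13 x 13
conv2 (F=3):       (13 - 3 + 0)/1 + 1 = 11        -> 16 x 11 x 11
pool  (F=2, S=2):  floor((11 - 2 + 0)/2) + 1 = 5  -> 16 x 5 x 5

which is why the nn.Linear layer needs 16*5*5 = 400 input features.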

You can also get the output shape of each layer by simply printing x.size() (or x.shape) inside the forward method.
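
For example, here is a sketch of the same forward method with print calls added (the commented shapes assume a batch of 32):

def forward(self, x):
    print(x.size())                                  # torch.Size([32, 1, 28, 28])
    x = F.max_pool2d(F.relu(self.conv(x)), (2, 2))
    print(x.size())                                  # torch.Size([32, 6, 13, 13])
    x = F.max_pool2d(F.relu(self.conv2(x)), (2, 2))
    print(x.size())                                  # torch.Size([32, 16, 5, 5])
    x = x.view(-1, 16 * 5 * 5)
    x = F.relu(self.fc(x))
    return F.softmax(self.output(x), dim=1)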

I think I got it now. Thanks!