Determining last linear layer dimensions automatically

Rahul_Seetharaman · July 27, 2020, 9:56am

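import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# device setup (assumed; the post does not show it)
device=torch.device("cuda" if torch.cuda.is_available() else "cpu")

class CNN(nn.Module):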
    def __init__(self,in_channels=1,num_classes=10):
        super(CNN,self).__init__()
        self.conv1=nn.Conv2d(
            in_channels=in_channels,
            out_channels=8,
            kernel_size=(3,3),
            stride=(1,1),
            padding=(1,1))
        self.pool=nn.MaxPool2d(kernel_size=(2,2),stride=(2,2))
        self.conv2=nn.Conv2d(
            in_channels=8,
            out_channels=16,
            kernel_size=(3,3),
            stride=(1,1),
            padding=(1,1))
        self.num_classes=num_classes
        self.fc1=nn.Linear(16*7*7,num_classes)
    def forward(self,X):
        X=F.relu(self.conv1(X))
        X=self.pool(X)
        X=F.relu(self.conv2(X))
        X=self.pool(X)
        X=X.reshape(X.shape[0],-1)
        print(X.shape)
        self.fc1=nn.Linear(X.shape[1],self.num_classes).to(device=device)
        X=self.fc1(X)
        return X
model=CNN().to(device=device)
x=torch.randn(64,1,28,28).to(device=device)
input_size=784
num_classes=10
learning_rate=0.001
batch_size=64
num_epochs=1

train_dataset=datasets.MNIST(root="datasets/",train=True,transform=transforms.ToTensor(),download=True)
train_loader=DataLoader(dataset=train_dataset,batch_size=batch_size,shuffle=True)
test_dataset=datasets.MNIST(root="datasets/",train=False,transform=transforms.ToTensor(),download=True)
test_loader=DataLoader(dataset=test_dataset,batch_size=batch_size,shuffle=True)
criterion=nn.CrossEntropyLoss()
optimizer=optim.Adam(model.parameters(),lr=learning_rate)

for epoch in range(num_epochs):
    for batch_idx,(data,targets) in enumerate(train_loader):
        data=data.to(device=device)
        targets=targets.to(device=device)
        # forward propagation
        scores=model(data)
        loss=criterion(scores,targets)
        # backward propagation
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

I wrote the code above for MNIST classification. It gives a really poor accuracy of 0.09.
But when I instead define the last Linear layer inside the constructor, with an input dimension of 16 x 7 x 7, the accuracy is 0.96. I found this rather strange. What could be the reason for this?

Unity05 · July 27, 2020, 12:28pm

Hi @Rahul_Seetharaman,

If I’ve understood your question correctly, you first defined the linear layer inside the forward() method and, after seeing the bad accuracy, defined it in the constructor, which gave you a much better result?
If so: everything that has trainable parameters should be defined in the constructor. By defining it in the forward method, you create a new, randomly initialized self.fc1 on every iteration, so training the layer makes no difference because it is overwritten in the next iteration.
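To make the effect concrete, here is a minimal sketch using a hypothetical toy module (`Recreated` is not from the original post). It rebuilds its linear layer on every forward call, as in the question, and checks two things: the weights differ between calls, and the optimizer still holds only the parameters that existed when it was created.

```python
import torch
import torch.nn as nn

class Recreated(nn.Module):
    """Hypothetical toy module that rebuilds its layer on every forward call."""
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 2)

    def forward(self, x):
        # A brand-new, randomly initialized layer replaces the old one here.
        self.fc1 = nn.Linear(x.shape[1], 2)
        return self.fc1(x)

torch.manual_seed(0)
model = Recreated()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
# Parameters the optimizer received when it was constructed.
tracked = [p for group in optimizer.param_groups for p in group["params"]]

x = torch.randn(3, 4)
model(x)
weight_after_first = model.fc1.weight.detach().clone()
model(x)
weight_after_second = model.fc1.weight.detach().clone()

# Each call re-initializes the layer, so the weights are not the same.
weights_changed = not torch.equal(weight_after_first, weight_after_second)
# Check whether any current parameter is one the optimizer would update.
optimizer_sees_current = any(p is q for p in model.parameters() for q in tracked)
print(weights_changed, optimizer_sees_current)
```

In this sketch the optimizer keeps stepping on the discarded layer from the constructor, while the layer that actually produces the scores is always fresh, which is consistent with the near-chance accuracy reported in the question.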
I hope I could help you.

Regards,
Unity05
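
As for the question in the title, one possible way to avoid hard-coding 16*7*7 while still creating the layer in the constructor is to pass a dummy input through the convolutional part once and read off the flattened size. The sketch below is one such approach, not the only one; the `input_size` argument and the `_features` helper are assumptions added for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CNN(nn.Module):
    def __init__(self, in_channels=1, num_classes=10, input_size=(28, 28)):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, 8, kernel_size=3, stride=1, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(8, 16, kernel_size=3, stride=1, padding=1)
        # Run one dummy sample through the convolutional part to measure
        # the flattened feature size instead of hard-coding it.
        with torch.no_grad():
            dummy = torch.zeros(1, in_channels, *input_size)
            flat_features = self._features(dummy).shape[1]
        # The layer is created once, so its parameters are registered
        # before the optimizer is built.
        self.fc1 = nn.Linear(flat_features, num_classes)

    def _features(self, x):
        # Hypothetical helper: conv/pool stack followed by flattening.
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        return x.reshape(x.shape[0], -1)

    def forward(self, x):
        return self.fc1(self._features(x))

model = CNN()
scores = model(torch.randn(64, 1, 28, 28))
print(model.fc1.in_features, tuple(scores.shape))  # 784 (64, 10)
```

Newer PyTorch releases also provide nn.LazyLinear, which infers the input size on the first forward pass; in that case a dummy forward pass before creating the optimizer is still advisable so that the parameters are materialized.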