Mismatched number of neurons in the classification layer and number of classes

import torch
import torch.nn as nn

class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, stride=1, kernel_size=5)  # makes 20 maps of 24x24
        self.pool = nn.MaxPool2d(2, 2)                          # makes 20 maps of 12x12
        self.fc1 = nn.Linear(20 * 12 * 12, 100)  # <--- 100 outputs instead of 10

    def forward(self, x):
        x = self.pool(torch.sigmoid(self.conv1(x)))
        x = torch.flatten(x, 1)  # flatten all dimensions except batch
        x = self.fc1(x)
        return x

I’m trying to train this network on the MNIST dataset and I do not get any tensor size mismatch error!
The last layer has 100 neurons instead of 10! In my case, accuracy was about 1% worse with 100 output neurons instead of 10, and I also see that this issue has been reported before. Anyway, my concern is: why was the size mismatch not flagged? How is autograd dealing with this situation? Any insights that shed light on this will be appreciated.
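
To illustrate, here is a quick shape check (the batch size of 64 is an arbitrary assumption) showing that the model really does emit 100 logits per image:

import torch

model = NeuralNetwork()
x = torch.randn(64, 1, 28, 28)  # a batch of MNIST-sized images (1x28x28)
out = model(x)
print(out.shape)                # torch.Size([64, 100]) -- 100 logits, not 10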

Your model is allowed to output more logits than you are using, i.e. in your example logits 10-99 will contain “random” data for the unknown classes with indices 10-99. The loss calculation doesn’t fail since the targets are valid as long as they are in [0, nb_classes - 1], which is still the case if you artificially inflate nb_classes.
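
As a minimal illustration (the logits here are random and the shapes are assumptions), nn.CrossEntropyLoss accepts 100 logits with targets in [0, 9] without complaint, and only fails when a target falls outside [0, nb_classes - 1]:

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
logits = torch.randn(8, 100)           # 100 logits per sample, as in the model above
targets = torch.randint(0, 10, (8,))   # MNIST labels are in [0, 9], all valid indices
print(criterion(logits, targets))      # works fine

bad_targets = torch.full((8,), 100)    # index 100 is out of range for 100 logits
# criterion(logits, bad_targets)       # this would raise an out-of-bounds error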

Could you show how the loss function was called and the shapes of the prediction and label tensors?

I am using loss_fn = nn.CrossEntropyLoss(). Theoretically, the loss is given by -ln(softmax(activation of the neuron corresponding to the label)).
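
For example, a small sketch (using random logits rather than the actual model output) confirming that nn.CrossEntropyLoss matches -ln(softmax(logit of the target class)):

import torch
import torch.nn.functional as F

logits = torch.randn(1, 100)           # one sample with 100 logits, as in the model above
target = torch.tensor([3])             # label 3

manual = -torch.log(F.softmax(logits, dim=1)[0, target])
builtin = F.cross_entropy(logits, target)
print(manual.item(), builtin.item())   # the two values agree up to floating-point precision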

The datasets were obtained as below:

from torchvision import datasets
from torchvision.transforms import ToTensor

training_data = datasets.MNIST(root="data", train=True, download=True, transform=ToTensor())
test_data = datasets.MNIST(root="data", train=False, download=True, transform=ToTensor())
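
For completeness, a typical way to feed these datasets to the model above (batch_size=64 and shuffle=True are assumptions, not part of my original code):

from torch.utils.data import DataLoader

train_loader = DataLoader(training_data, batch_size=64, shuffle=True)
test_loader = DataLoader(test_data, batch_size=64, shuffle=False)

images, labels = next(iter(train_loader))
print(images.shape, labels.shape)  # torch.Size([64, 1, 28, 28]) torch.Size([64])
print(labels.min(), labels.max())  # labels stay in [0, 9] even though the model has 100 outputs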