Linear NN for MNIST

xzlhqed · May 18, 2020, 1:19pm

Hi

I apologise for the basic question. I am trying to implement a linear NN (linear activation, MSE error minimisation) to train on MNIST, and I am having difficulties that I haven’t been able to resolve from googling. My model was initially defined as follows:

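import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumed device setup; not shown in the original post
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(28*28, 50)
        self.fc1_drop = nn.Dropout(0.2)
        self.fc2 = nn.Linear(50, 50)
        self.fc2_drop = nn.Dropout(0.2)
        self.fc3 = nn.Linear(50, 10)

    def forward(self, x):
        # Flatten each 28x28 image into a 784-dimensional vector
        x = x.view(-1, 28*28)
        # Apply each linear layer before its dropout (no nonlinearity in between)
        x = self.fc1_drop(self.fc1(x))
        x = self.fc2_drop(self.fc2(x))
        return F.log_softmax(self.fc3(x), dim=1)

model = Net().to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.5)
criterion = nn.MSELoss()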

but when I tried training, I got an error from MSELoss because the output of the neural net (for a batch size of 32) has size (32, 10) and the target has size (32). I tried using target.view(1, -1).float(), and the model ran, but I got nonsensical values for the loss.

I changed the model so that self.fc3 = nn.Linear(50, 1) to return a single value, but when I train the network every value returned is 0. I understand that for MSELoss to work, the tensors must match in trailing dimensions, but I am currently struggling to figure out how to make it actually work for MNIST.

satyajitghana · May 18, 2020, 2:24pm

Try printing out the output of the model and the target. I think the model is outputting a (log-)probability for each of the ten possible digits (0-9), while the target is a single class index per image. You’ll have to convert the target to one-hot so that it has the same shape as the output, and then apply the loss function.
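
As a rough sketch of that conversion (not taken from the original post): the snippet below uses random tensors as stand-ins for one batch of model outputs and MNIST labels, and `F.one_hot` to build a target of matching shape. The names `output`, `target`, and `target_one_hot` are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

batch_size, num_classes = 32, 10

# Stand-ins for one MNIST batch (assumed for illustration):
# raw model outputs and integer class labels.
output = torch.randn(batch_size, num_classes)           # shape (32, 10)
target = torch.randint(0, num_classes, (batch_size,))   # shape (32,), dtype int64

# One-hot encode the labels so they match the output shape, then cast to
# float because MSELoss expects floating-point tensors.
target_one_hot = F.one_hot(target, num_classes=num_classes).float()  # shape (32, 10)

criterion = nn.MSELoss()
loss = criterion(output, target_one_hot)

print(output.shape, target_one_hot.shape, loss.item())
```

In a training loop, the same two lines (one-hot encoding, then `criterion(output, target_one_hot)`) would replace the direct call on the integer target, and the predicted digit can still be read off with `output.argmax(dim=1)`. One caveat, offered as a suggestion rather than a definitive fix: `log_softmax` outputs are never positive, so they cannot reach a one-hot target of 1; returning the raw output of the last linear layer may be closer to the "linear activation" setup described in the question.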

Also take a look here (it uses the MNIST dataset from PyTorch): https://github.com/AvivSham/Pytorch-MNIST-colab/blob/master/Pytorch_MNIST.ipynb