Hi
I apologise for the basic question. I am trying to implement a linear NN (linear activations, trained by minimising MSE) on MNIST, and I am having difficulties that I haven't been able to resolve by googling. My model was initially defined as follows:
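import torch
import torch.nn as nn
import torch.nn.functional as F

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(28*28, 50)
        self.fc1_drop = nn.Dropout(0.2)
        self.fc2 = nn.Linear(50, 50)
        self.fc2_drop = nn.Dropout(0.2)
        self.fc3 = nn.Linear(50, 10)

    def forward(self, x):
        x = x.view(-1, 28*28)            # flatten each image to 784 values
        x = self.fc1_drop(self.fc1(x))   # linear layer (no nonlinearity), then dropout
        x = self.fc2_drop(self.fc2(x))
        return F.log_softmax(self.fc3(x), dim=1)  # output shape: (batch, 10)

model = Net().to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.5)
criterion = nn.MSELoss()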
However, when I tried training I got an error from MSELoss, because the output of the network (for a batch size of 32) has size (32, 10) while the target has size (32). I tried using target.view(1, -1).float(), and the model ran, but I got nonsensical values for the loss.
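To make the shape problem concrete, here is a minimal sketch of one possible way to get matching shapes: one-hot encoding the integer labels so the target also has size (32, 10). The random tensors below are stand-ins for a real batch, and the one-hot approach is an assumption about what would suit this setup, not something taken from the code above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

batch_size, num_classes = 32, 10

# Stand-ins for one batch: random "network outputs" and integer digit labels.
output = torch.randn(batch_size, num_classes)          # shape (32, 10)
target = torch.randint(0, num_classes, (batch_size,))  # shape (32,)

# One-hot encode the labels so the target has one column per class.
# F.one_hot returns an integer tensor, so cast to float for MSELoss.
target_one_hot = F.one_hot(target, num_classes=num_classes).float()  # (32, 10)

criterion = nn.MSELoss()
loss = criterion(output, target_one_hot)

print(output.shape, target.shape, target_one_hot.shape)
print(loss.item())
```

With targets in this form, each of the 10 outputs is compared against 0 or 1 rather than against the raw digit value, and the predicted class can be read off with output.argmax(dim=1).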
I then changed the last layer to self.fc3 = nn.Linear(50, 1) so that the network returns a single value, but when I train it every value returned is 0. I understand that for MSELoss to work the output and target tensors must have matching shapes, but I am currently struggling to figure out how to make it actually work for MNIST.
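A likely explanation for the all-zero outputs, sketched below under the assumption that the forward pass still ends in F.log_softmax(..., dim=1): softmax over a dimension of size 1 is always 1, so its log is always 0 regardless of the weights. The random tensor is a stand-in for the output of nn.Linear(50, 1).

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in for the output of nn.Linear(50, 1) on a batch of 32 samples.
raw = torch.randn(32, 1)

# Softmax over a dimension of size 1 is always 1, so its log is always 0.
log_probs = F.log_softmax(raw, dim=1)

print(raw[:3].flatten())        # arbitrary non-zero values
print(log_probs[:3].flatten())  # zeros, whatever the input was
```

If that is what is happening, returning the raw output of the last linear layer (no log_softmax) would be the natural choice for a model trained with MSELoss.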