I am new to PyTorch and still learning from many different examples and sources, so please excuse any glaring inconsistencies. I am trying to implement an autoencoder for the Fashion-MNIST dataset and then feed the trained output of the bottleneck layer into an MLP classifier. The AE itself was not too much trouble, but adding the classifier is where I am running into issues.
My MLP output layer has 10 nodes (one per class), and the MNIST labels are integers by default rather than one-hot vectors. From what I have read, this should be no problem for nll_loss or cross_entropy (which, to my understanding, calls nll_loss internally); a toy example of the usage I expected is included just after the traceback. In my case, however, it does not work: I get the following error when computing the cross-entropy loss between my 10-dimensional predictions and the integer target labels:
Traceback (most recent call last):
  File "sae_fmnist.py", line 194, in <module>
    test_SAE(model, test_set, test_bs) # see initial perf with random weights
  File "sae_fmnist.py", line 142, in test_SAE
    test_loss_c += F.cross_entropy(pred, target)
  File "/home/ryan/.local/lib/python3.5/site-packages/torch/nn/functional.py", line 2009, in cross_entropy
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
  File "/home/ryan/.local/lib/python3.5/site-packages/torch/nn/functional.py", line 1848, in nll_loss
    out_size, target.size()))
ValueError: Expected target size (1, 10), got torch.Size([1])
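For reference, here is a minimal sketch of the usage I thought was supported; the tensors are just illustrative, not from my script:

import torch
import torch.nn.functional as F

scores = torch.randn(3, 10)             # (N, C): raw logits for 3 samples, 10 classes
labels = torch.tensor([4, 0, 9])        # (N,): integer class indices, no one-hot needed
loss = F.cross_entropy(scores, labels)  # returns a scalar tensor without complaint
print(loss.item())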
My model is defined as follows:
class Autoencoder(nn.Module):
    def __init__(self):
        super(Autoencoder, self).__init__()
        # encoder: 784 -> 500 -> 200 -> 20 (bottleneck)
        self.encoder = nn.Sequential(
            nn.Linear(in_features=28*28, out_features=500),
            nn.ReLU(inplace=True),
            nn.Linear(in_features=500, out_features=200),
            nn.ReLU(inplace=True),
            nn.Linear(in_features=200, out_features=20),
            nn.ReLU(inplace=True)
        )
        # decoder: 20 -> 200 -> 500 -> 784 (reconstruction)
        self.decoder = nn.Sequential(
            nn.Linear(in_features=20, out_features=200),
            nn.ReLU(inplace=True),
            nn.Linear(in_features=200, out_features=500),
            nn.ReLU(inplace=True),
            nn.Linear(in_features=500, out_features=28*28)
        )
        # classifier: bottleneck (20) -> 10 class scores
        self.classifier = nn.Sequential(
            nn.Linear(in_features=20, out_features=10)#,
            #nn.LogSoftmax(dim=1)
        )

    # define forward function
    def forward(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        pred = self.classifier(encoded)
        return decoded, pred
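As a sanity check (mine, not part of the training script), a forward pass on a dummy batch of already-flattened images gives the shapes I expect:

import torch

model = Autoencoder()
x = torch.randn(4, 28*28)   # dummy batch of 4 flattened 28x28 images
decoded, pred = model(x)
print(decoded.shape)        # torch.Size([4, 784])
print(pred.shape)           # torch.Size([4, 10]) -- one score per class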
And the function that raises the error is here:
def test_SAE(model, dataset, batch_size):
    model.eval()  # puts model in evaluation mode
    test_loss_d = 0
    test_loss_c = 0
    num_correct = 0
    num_digits = 10
    test_loader = torch.utils.data.DataLoader(dataset)
    with torch.no_grad():
        for data, target in test_loader:
            data = data.flatten(start_dim=2)
            output, pred = model(data)
            criterion_d = torch.nn.MSELoss(size_average=False)
            #criterion_c = torch.nn.CrossEntropyLoss(size_average=False)
            test_loss_d += criterion_d(output, data).item()
            #test_loss_c += criterion_c(pred, target).item()
            test_loss_c += F.cross_entropy(pred, target)  # line 142: this raises the ValueError
            #p_class = pred.data.max(1, keepdim=True)[1]
            #num_correct += p_class.eq(target.data.view_as(p_class)).sum()
            num_correct += pred.argmax(dim=1).eq(target).sum().item()
    test_loss_d /= len(test_loader.dataset)
    test_loss_c /= len(test_loader.dataset)
    print('\nTest set: Avg. loss: {:.4f}, {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss_d, test_loss_c, num_correct, len(test_loader.dataset),
        100. * num_correct / len(test_loader.dataset)))
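For completeness, the data setup is the standard torchvision one, roughly like this (the exact paths and options in my script may differ, so treat this as a sketch):

import torchvision
import torchvision.transforms as transforms

test_set = torchvision.datasets.FashionMNIST(
    root='./data', train=False, download=True,
    transform=transforms.ToTensor())  # images come out as (1, 28, 28) tensors
test_bs = 1  # placeholder value; note that test_SAE builds its DataLoader without it

# test_SAE(model, test_set, test_bs)  # this is the call that triggers the traceback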
Some things of note:
- I have left in a few commented-out lines showing other methods I attempted, which produced similar errors.
- The dual output of the network is based on this AE: Autoencoder and Classification inside the same model
- I call test_SAE once before any training, which is why the error appears in my test function first. I expect the same error in my (similar) training function, and I assume it has the same root cause.
Any suggestions would be appreciated. Thanks!