@ptrblck, that was awesome. After adding some list tests and splitting the data further into train, valid, and test sets, it reached approximately 88% accuracy.
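For reference, the split looks roughly like this (a minimal sketch using torch.utils.data.random_split; the "data" root path and batch_size=64 are assumptions, the batch size inferred from the 782/157 batch counts printed below):

import torch
from torchvision import datasets, transforms

# standard FashionMNIST datasets; the "data" root is an assumption
full_train = datasets.FashionMNIST(root="data", train=True, download=True,
                                   transform=transforms.ToTensor())
test_set = datasets.FashionMNIST(root="data", train=False, download=True,
                                 transform=transforms.ToTensor())

# 50000/10000 train/valid split, matching the Subset sizes printed below
train_set, valid_set = torch.utils.data.random_split(full_train, [50000, 10000])
train_dl = torch.utils.data.DataLoader(train_set, batch_size=64, shuffle=True)
test_dl = torch.utils.data.DataLoader(test_set, batch_size=64)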
Running the model on a single batch got all predictions correct:
from numpy import argmax

# make a single prediction on the first test batch
print("Making prediction...")
enum_test_dl = list(enumerate(test_dl))
enum_test_dl_sub = enum_test_dl[:1]       # keep only the first (index, batch) pair
print("enum_test_dl_sub: ", len(enum_test_dl_sub))
inputs, targets = enum_test_dl_sub[0][1]  # unpack (inputs, labels) from the batch
yhat = model(inputs)
yhat = yhat.detach().numpy()
actual = targets.numpy()
# convert raw outputs to class labels
yhat = argmax(yhat, axis=1)
print("yhat: ", yhat[:10])
print("actual: ", actual[:10])
Output:

train/valid/test: torch.utils.data.dataset.Subset 50000 torch.utils.data.dataset.Subset 10000 torchvision.datasets.mnist.FashionMNIST 10000
train_dl/test_dl: torch.utils.data.dataloader.DataLoader 782 torch.utils.data.dataloader.DataLoader 157
50000 10000
train_dl: 782
epoch: 0 / 10........................................loss: tensor(0.5518, grad_fn=<NllLossBackward>)
epoch: 1 / 10........................................loss: tensor(0.3474, grad_fn=<NllLossBackward>)
epoch: 2 / 10........................................loss: tensor(0.3830, grad_fn=<NllLossBackward>)
epoch: 3 / 10........................................loss: tensor(0.3948, grad_fn=<NllLossBackward>)
epoch: 4 / 10........................................loss: tensor(0.4192, grad_fn=<NllLossBackward>)
epoch: 5 / 10........................................loss: tensor(0.1886, grad_fn=<NllLossBackward>)
epoch: 6 / 10........................................loss: tensor(0.0419, grad_fn=<NllLossBackward>)
epoch: 7 / 10........................................loss: tensor(0.1398, grad_fn=<NllLossBackward>)
epoch: 8 / 10........................................loss: tensor(0.0961, grad_fn=<NllLossBackward>)
epoch: 9 / 10........................................loss: tensor(0.3626, grad_fn=<NllLossBackward>)
Accuracy: 0.880
Making prediction...
enum_test_dl_sub: 1
yhat: [9 2 1 1 6 1 4 6 5 7]
actual: [9 2 1 1 6 1 4 6 5 7]
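As an aside, materializing the whole loader with list(enumerate(test_dl)) isn't necessary just to grab one batch; a minimal alternative sketch, assuming the same model and test_dl as above:

import torch

# same single-batch prediction via next(iter(...)), without building the full list
model.eval()                               # inference mode (matters for dropout/batchnorm, if any)
with torch.no_grad():                      # no gradients needed for prediction
    inputs, targets = next(iter(test_dl))  # first batch only
    preds = model(inputs).argmax(dim=1)
print("yhat: ", preds[:10].tolist())
print("actual: ", targets[:10].tolist())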
However, a lingering question remains: in the TensorFlow/Keras version I was porting from, the final layer used softmax and it worked just fine. Just wondering why:
...
model = keras.models.Sequential()
model.add(keras.layers.Flatten(input_shape=[28, 28]))
model.add(keras.layers.Dense(300, activation="relu"))
model.add(keras.layers.Dense(100, activation="relu"))
model.add(keras.layers.Dense(10, activation="softmax"))  # 10 output classes for Fashion-MNIST
...
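My current understanding, for what it's worth: PyTorch's nn.CrossEntropyLoss applies log_softmax internally and expects raw logits, while Keras's sparse_categorical_crossentropy defaults to from_logits=False and so expects softmax probabilities, which would explain why the explicit softmax layer is fine in Keras but redundant with nn.CrossEntropyLoss in PyTorch. A minimal sketch of the equivalence on the PyTorch side:

import torch
import torch.nn.functional as F

logits = torch.randn(4, 10)          # raw model outputs: 4 samples, 10 classes
target = torch.randint(0, 10, (4,))  # integer class labels

# cross_entropy == log_softmax followed by nll_loss,
# so the model itself should not end with a softmax layer
loss_ce = F.cross_entropy(logits, target)
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), target)
print(torch.allclose(loss_ce, loss_manual))  # True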