Hi!
So, while running this official example unchanged:
from __future__ import print_function
import argparse
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
from torch.autograd import Variable
# Training settings
parser = argparse.ArgumentParser(description='PyTorch MNIST Example')
parser.add_argument('--batch-size', type=int, default=64, metavar='N',
                    help='input batch size for training (default: 64)')
parser.add_argument('--test-batch-size', type=int, default=1000, metavar='N',
                    help='input batch size for testing (default: 1000)')
parser.add_argument('--epochs', type=int, default=10, metavar='N',
                    help='number of epochs to train (default: 10)')
parser.add_argument('--lr', type=float, default=0.01, metavar='LR',
                    help='learning rate (default: 0.01)')
parser.add_argument('--momentum', type=float, default=0.5, metavar='M',
                    help='SGD momentum (default: 0.5)')
# (snippet truncated; the rest of the file follows the official example)
this is the output from the last epoch:
Train Epoch: 10 [55040/60000 (92%)] Loss: 0.185965
Train Epoch: 10 [55680/60000 (93%)] Loss: 0.099982
Train Epoch: 10 [56320/60000 (94%)] Loss: 0.271109
Train Epoch: 10 [56960/60000 (95%)] Loss: 0.049256
Train Epoch: 10 [57600/60000 (96%)] Loss: 0.384411
Train Epoch: 10 [58240/60000 (97%)] Loss: 0.182649
Train Epoch: 10 [58880/60000 (98%)] Loss: 0.374920
Train Epoch: 10 [59520/60000 (99%)] Loss: 0.307496
Test set: Average loss: 0.0486, Accuracy: 9838/10000 (98%)
As you can see, the average loss on the test set is roughly an order of magnitude lower than the loss on the training batches.
That seems impossible; it should be the other way around (the model should either overfit, or the two losses should be roughly the same).
What's going on, and where is the bug? (I literally ran the official example unchanged.)
jpeg729 (March 25, 2018, 7:53am):
During training the model uses Dropout; at test time it doesn't. Dropout randomly zeroes activations, so the loss logged during training is computed by a handicapped network, while the test loss is computed with the full network. That makes the training loss worse.
If you remove the dropout layer, the two losses should be much more similar.
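You can see the effect directly by toggling a module between train and eval mode. This is a minimal sketch, not the example's actual Net; the layer sizes and dropout probability here are arbitrary stand-ins:

import torch
import torch.nn as nn

# A toy layer with dropout, standing in for the example's dropout unit.
layer = nn.Sequential(nn.Linear(10, 10), nn.Dropout(p=0.5))
x = torch.randn(4, 10)

layer.train()                      # dropout active (as during training)
out_train = layer(x)

layer.eval()                       # dropout disabled (as during testing)
out_eval = layer(x)

# Train-mode outputs differ from eval-mode outputs because half the
# activations were randomly zeroed; this is why the loss reported while
# training is systematically higher than the test loss.
print(torch.equal(out_train, out_eval))   # False (almost surely)
print(torch.equal(layer(x), layer(x)))    # True: eval mode is deterministic

The same comparison applies to the whole model: calling model.eval() before measuring the loss on the training set would make the two numbers directly comparable.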
Ah, I didn't notice the dropout unit, thanks!