I tried your model using a fake dataset:
import torchvision

dataset = torchvision.datasets.FakeData(
    size=120,
    image_size=(3, 40, 40),
    num_classes=2,
    transform=transform)  # reuses your existing transform
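For reference, a minimal sketch of wrapping this dataset in a standard DataLoader for the overfitting run (the batch size is arbitrary here):

import torch

# Wrap the fake dataset for the overfitting run; batch_size is arbitrary.
loader = torch.utils.data.DataLoader(dataset, batch_size=10, shuffle=True)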
and could overfit it in ~20 epochs if I use a weight initialization:
import torch.nn as nn

def weight_init(m):
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        nn.init.xavier_uniform_(m.weight)  # xavier_uniform is deprecated in favor of xavier_uniform_
        nn.init.zeros_(m.bias)
Without the weight init it takes ~40 epochs to reach zero resubstitution (training) error.
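For completeness, a minimal sketch of how the init function would be applied; the Sequential stack is just a stand-in for your model:

import torch.nn as nn

model = nn.Sequential(  # stand-in architecture; use your own model here
    nn.Conv2d(3, 8, 3),
    nn.Flatten(),
    nn.Linear(8 * 38 * 38, 2))
model.apply(weight_init)  # applies weight_init recursively to every submodule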
If you have an imbalanced dataset, you could use WeightedRandomSampler.
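A rough sketch of how the sampler could be set up, reusing the dataset from above and assuming you have a tensor of integer class targets (the targets here are just random placeholders):

import torch
from torch.utils.data import DataLoader, WeightedRandomSampler

targets = torch.randint(0, 2, (120,))  # placeholder targets; use your real labels

# Weight each sample by the inverse frequency of its class.
class_counts = torch.bincount(targets)
sample_weights = (1.0 / class_counts.float())[targets]

sampler = WeightedRandomSampler(
    weights=sample_weights,
    num_samples=len(sample_weights),
    replacement=True)

# shuffle must stay False (the default) when a sampler is passed
loader = DataLoader(dataset, batch_size=10, sampler=sampler)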
Yes, it’s fine to print the probabilities via torch.exp,
as long as you don’t use the result to calculate the loss etc.
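A small sketch of what I mean, assuming the model returns log-probabilities (e.g. via F.log_softmax) and the loss is computed with nll_loss; the random tensors are just placeholders:

import torch
import torch.nn.functional as F

log_probs = F.log_softmax(torch.randn(4, 2), dim=1)  # stand-in for the model output
target = torch.randint(0, 2, (4,))

loss = F.nll_loss(log_probs, target)  # the loss uses the log-probabilities directly
probs = torch.exp(log_probs)          # only for printing / inspection
print(probs)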