Greetings everyone,
I am new to PyTorch and to this forum, and I hope I can get a little help here and there.
I just started working through the PyTorch 60 Minute Blitz tutorial, and I noticed that in one of its examples, after a single step of an SGD optimizer, the loss of a pretrained model did not decrease in absolute value:
import torch, torchvision
# %% Downloading stuff
model = torchvision.models.resnet18(pretrained=True)
# %% Rest
data = torch.rand(1, 3, 64, 64)   # a single random input image
labels = torch.rand(1, 1000)      # a random target vector
predic = model(data)
loss = (predic - labels).sum()    # sum of raw differences as a makeshift loss
print(loss)  # loss is negative
loss.backward()
optim = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
optim.step()                      # a single optimization step
predic = model(data)              # second forward pass with the updated weights
loss = (predic - labels).sum()
print(loss)  # loss is even more negative, i.e. larger absolute value
This essentially follows the example here:
https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html
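In case it helps, here is how I would re-check the numbers (just a sketch of my own, not from the tutorial; the names loss_before/loss_after are mine). I put the model in eval mode so the batch-norm layers in resnet18 use their running statistics, and wrap the second forward pass in torch.no_grad(), since I am not sure whether train-mode behaviour affects the comparison:

import torch, torchvision

# pretrained=True as in the tutorial; newer torchvision versions prefer the weights argument
model = torchvision.models.resnet18(pretrained=True)
model.eval()  # batch norm uses running stats, so both forward passes are comparable

data = torch.rand(1, 3, 64, 64)
labels = torch.rand(1, 1000)
optim = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

loss_before = (model(data) - labels).sum()
optim.zero_grad()
loss_before.backward()
optim.step()

with torch.no_grad():  # no autograd graph needed for the check
    loss_after = (model(data) - labels).sum()
print(loss_before.item(), loss_after.item())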
Shouldn't the absolute value of the loss decrease after a single optimization step when there is only one data point? Or does this have to do with the fact that the model is pretrained?
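To make concrete what I expected, here is a toy one-parameter example (again my own sketch, not from the tutorial) where a single SGD step does reduce the loss:

import torch

w = torch.tensor([2.0], requires_grad=True)
opt = torch.optim.SGD([w], lr=0.1)

loss = (w ** 2).sum()  # quadratic loss with a minimum at w = 0
loss.backward()        # dloss/dw = 2w = 4
opt.step()             # w -> 2 - 0.1 * 4 = 1.6

with torch.no_grad():
    print(loss.item(), (w ** 2).sum().item())  # 4.0 -> 2.56, the loss decreased

This is the behaviour I expected from the resnet18 example as well.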
Best,
PhysicsIsFun