Warning appearing depending on loss function?

I’ve been working through a PyTorch tutorial here: https://jovian.ml/aakashns/fdaae0bf32cf4917a931ac415a5c31b0 and have been trying to implement the neural network from the first chapter of http://neuralnetworksanddeeplearning.com/. One of the changes to make to the tutorial’s neural network is changing the loss function from cross entropy to MSE. After initialising the network, the tutorial runs the following:

for images, labels in train_loader:
    outputs = model(images)

    print(outputs.shape,labels.shape)
    loss = F.cross_entropy(outputs, labels)
    
    print('Loss:', loss.item())
    break

print('outputs.shape : ', outputs.shape)
print('Sample outputs :\n', outputs[:2].data)

which prints

torch.Size([128, 10]) torch.Size([128])
Loss: 2.3188605308532715
outputs.shape :  torch.Size([128, 10])
Sample outputs: ...

However, when running essentially the same code after changing the loss function to MSELoss:

for images, labels in train_loader:
    outputs = model(images)

    print(outputs.shape, labels.shape)

    loss_fn = nn.MSELoss()
    loss = loss_fn(outputs, labels) # Calculate loss
    
    print('Loss:', loss.item())
    break

print('outputs.shape : ', outputs.shape)
print('Sample outputs :\n', outputs[:2].data)

I get the following output (note that a different batch size is used here), followed by a warning:

torch.Size([10, 10]) torch.Size([10])
Loss: 42.0984992980957
...
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/loss.py:432: UserWarning: Using a target size (torch.Size([10])) that is different to the input size (torch.Size([10, 10])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.

Why is this warning appearing for one loss function and not the other?

Extra: in addition, why doesn’t writing loss = nn.MSELoss(outputs, labels), rather than the loss_fn workaround I’ve been using, work?

Hello Arthur -

cross_entropy() expects outputs of shape [nBatch, nClass] and
labels of shape [nBatch] (no nClass – each single label value is
an integer class label that runs from 0 to nClass - 1).

MSELoss expects outputs of shape [nBatch, nValues] and labels
(which really shouldn’t be called “labels” for MSELoss) also of shape
[nBatch, nValues].
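To make the two shape conventions concrete, here is a small sketch with random tensors (not your tutorial’s model; the batch size and class count are made up):

```python
import torch
import torch.nn.functional as F

# Toy shapes: nBatch = 4, nClass = 3
outputs = torch.randn(4, 3)                # [nBatch, nClass]
class_labels = torch.tensor([0, 2, 1, 2])  # [nBatch], integer class labels

# cross_entropy() accepts this pairing directly
ce = F.cross_entropy(outputs, class_labels)

# mse_loss() wants the target to have the same shape as the input
targets = torch.randn(4, 3)                # [nBatch, nValues]
mse = F.mse_loss(outputs, targets)

print(ce.item(), mse.item())  # both are scalar losses
```

Passing the [nBatch] integer labels to mse_loss() instead of targets here would trigger exactly the broadcasting warning you saw.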

I’m not really sure what the tutorial is looking for here, because if
cross_entropy() makes sense for a model, MSELoss is unlikely
to also make sense.

Having said that, you could either one-hot your integer class labels
to get labels of shape [nBatch, nClass] to match the output of
your model, or you could change your model to output a single
value (instead of nClass values) so that its output for a batch will
be of shape [nBatch] and thus match the shape of your labels.
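The one-hot route can be sketched like this (again with made-up shapes, standing in for your model’s output):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

nClass = 10
labels = torch.tensor([3, 0, 7])  # [nBatch] integer class labels

# Convert to one-hot floats so the target matches [nBatch, nClass]
one_hot = F.one_hot(labels, num_classes=nClass).float()  # [3, 10]

outputs = torch.randn(3, nClass)  # stand-in for model(images)
loss = nn.MSELoss()(outputs, one_hot)  # shapes match, no warning
print(loss.item())
```

Note the .float() call: one_hot() returns integer tensors, and MSELoss expects floating-point inputs on both sides.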

torch.nn.MSELoss is a class, while torch.nn.functional.mse_loss()
is a function. Thus loss = nn.MSELoss(outputs, labels) is
incorrectly calling the constructor of the MSELoss class with the
wrong arguments.

Instead of:

loss = nn.MSELoss (outputs, labels)   # wrong

you should either write:

loss = nn.MSELoss() (outputs, labels)

(note the extra pair of parentheses), or:

loss = F.mse_loss (outputs, labels)
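Both forms compute the same thing, which you can check with a quick sketch:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

outputs = torch.randn(5, 2)
targets = torch.randn(5, 2)

loss_a = nn.MSELoss()(outputs, targets)  # construct the class, then call the instance
loss_b = F.mse_loss(outputs, targets)    # call the function directly
assert torch.isclose(loss_a, loss_b)
```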

Good luck.

K. Frank
