MSELoss inquiry regarding size of inputs

When you define the loss function as MSE,

i.e.,

cost_func = nn.MSELoss()

and then you want to compute the loss using

loss = cost_func(predicted_labels, training_labels)

Would it be an issue if my predicted_labels tensor size is torch.Size([1500, 1, 1]) and the training_labels size is torch.Size([1500])?

The target tensor should be automatically broadcast to the necessary shape:

criterion = nn.MSELoss()
criterion(torch.randn(10, 1, 1), torch.randn(10))
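That said, if you want the loss computed element-wise, it is usually safer to make the shapes match explicitly. A minimal sketch, using made-up random tensors with the shapes from your question (the variable names are just placeholders), flattening the prediction before the loss:

import torch
import torch.nn as nn

criterion = nn.MSELoss()

predicted_labels = torch.randn(1500, 1, 1)  # placeholder model output of shape [1500, 1, 1]
training_labels = torch.randn(1500)         # placeholder targets of shape [1500]

# Flatten the prediction to [1500] so both tensors have the same shape
loss = criterion(predicted_labels.view(-1), training_labels)
print(loss)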

Hm, it seems that I am getting different answers when I have different shapes vs. the same shape.

Try this code

import torch
import torch.nn as nn

torch.manual_seed(7)

# Case 1: both tensors have shape [3]
a = torch.Tensor((1, 2, 3))
b = torch.Tensor((4, 5, 6))
print(a.shape)
print(b.shape)

criterion = nn.MSELoss()
print(criterion(a, b))

# Case 2: a is reshaped to [3, 1] while b stays [3]
a = torch.Tensor((1, 2, 3))
a = a.reshape(3, 1)
b = torch.Tensor((4, 5, 6))
print(a.shape)
print(b.shape)

criterion = nn.MSELoss()
print(criterion(a, b))

The output is

torch.Size([3])
torch.Size([3])
tensor(9.)
torch.Size([3, 1])
torch.Size([3])
tensor(10.3333)

Yes, the results will differ and I guess you would like to get the first result.
In your second example, a will be broadcast against b when you compute (a - b), giving:

tensor([[-3., -4., -5.],
        [-2., -3., -4.],
        [-1., -2., -3.]])

i.e. b is subtracted from each element of a, so row i holds a[i] - b. The loss is then averaged over 9 elements instead of 3, which is why you get 93 / 9 ≈ 10.3333 instead of 27 / 3 = 9.
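A quick sketch to check the arithmetic with the same a and b, and one way to recover the first result by removing the extra dimension before the loss:

import torch
import torch.nn as nn

a = torch.Tensor((1, 2, 3))
b = torch.Tensor((4, 5, 6))

# Same shape: mean of [9, 9, 9] -> tensor(9.)
print(((a - b) ** 2).mean())

# Column vector vs. 1-D tensor: (a.reshape(3, 1) - b) broadcasts to a 3x3 matrix,
# so the mean is taken over 9 squared differences -> tensor(10.3333)
print(((a.reshape(3, 1) - b) ** 2).mean())

# Squeezing the extra dimension makes both tensors [3] again -> tensor(9.)
criterion = nn.MSELoss()
print(criterion(a.reshape(3, 1).squeeze(), b))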