Input size and target size mismatch

Hi,

I am working on regressing a score (a positive real value) from images, so the structure is almost identical to PyTorch's "Training a Classifier" example except for a few parts, including the change from CrossEntropyLoss() to MSELoss(). The input size to the CNN is [4, 3, 240, 240], where 4 is the batch size, 3 is the channel size, and 240x240 is the image size. The output from the CNN (i.e., from outputs = net(inputs)) is [4, 1] because I set the last linear layer to nn.Linear(84, 1) in order to receive a single value. I may be doing something wrong. Please let me know if including the entire code would be helpful for this issue. Thanks!

UserWarning: Using a target size (torch.Size([4, 1, 1])) that is different to the input size (torch.Size([4, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
  return F.mse_loss(input, target, reduction=self.reduction)
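For reference, a minimal sketch (using a made-up random feature batch) confirming that a final nn.Linear(84, 1) layer, as described above, maps a batch of 4 samples to an output of shape [4, 1]:

```python
import torch
import torch.nn as nn

# Final layer as described in the post: maps 84 features to a single score
fc = nn.Linear(84, 1)

x = torch.randn(4, 84)   # hypothetical batch of 4 feature vectors
out = fc(x)
print(out.shape)         # torch.Size([4, 1])
```

So the model output shape itself is fine; the warning points at the target shape instead.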

If you are using nn.MSELoss, the output and target shapes should be equal.
Currently your target has an additional dimension.
You could remove it via target = target.squeeze(1) or target = target.squeeze(2).
What do the dimensions mean in the target?
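To illustrate the broadcasting pitfall the warning refers to, here is a small sketch (shapes taken from the warning message): a [4, 1, 1] target silently broadcasts against a [4, 1] input to [4, 4, 1], so the loss is averaged over unintended pairs, and squeeze restores the matching shape:

```python
import torch
import torch.nn.functional as F

output = torch.randn(4, 1)     # model output, shape [4, 1]
target = torch.randn(4, 1, 1)  # target with an extra dimension, shape [4, 1, 1]

# Broadcasting aligns trailing dims: [4, 1] vs [4, 1, 1] -> both expand to
# [4, 4, 1], so every output is compared against every target value.
print(torch.broadcast_shapes(output.shape, target.shape))  # torch.Size([4, 4, 1])

fixed = target.squeeze(2)      # or target.squeeze(1); both give [4, 1]
loss = F.mse_loss(output, fixed)
print(fixed.shape)             # torch.Size([4, 1])
```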

1 Like

Thanks @ptrblck for the answer! You are right: after fixing the target dimension, I am no longer getting the error. The correct target dimension was [4, 1] (4: batch size, 1: single value), but I was sending it with one extra bracket (following the classification case), which gave it an additional, wrong dimension.

I have a follow-up question (let me know if I should create a new thread). The training code now runs, but I am getting a loss of nan from the beginning. My data is a set of black-and-white images (mostly white), each associated with a single real value (e.g., 184). I am basically trying to learn a predictor that maps a new black-and-white image to a single real value. Do you have any thoughts on why I am getting nan values? I am including the code snippet here in case it is helpful. Thanks!

import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import transforms

train_data = Dataset(csv_file='data/train/data.csv',
                     root_dir='data/train/',
                     transform=transforms.Compose([ToTensor()]))
trainloader = DataLoader(train_data, batch_size=4,
                         shuffle=True, num_workers=4)

net = Net()
criterion = nn.MSELoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

for epoch in range(2):

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        inputs = inputs.float()

        optimizer.zero_grad()
        outputs = net(inputs)

        loss = criterion(outputs.double(), labels)

        loss.backward()
        optimizer.step()
        running_loss += loss.item()

        if i % 20 == 19:
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 20))
            running_loss = 0.0

I would recommend checking the input for invalid values via torch.isfinite.
Are you getting the NaN output directly in the first iteration?
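A minimal sketch of that check, assuming inputs is a batch tensor from the DataLoader (the corrupted value here is made up for illustration):

```python
import torch

# Hypothetical batch with one corrupted pixel value
inputs = torch.tensor([[0.5, float('nan')],
                       [1.0, 2.0]])

mask = torch.isfinite(inputs)  # True where values are finite (no NaN/Inf)
if not mask.all():
    n_bad = (~mask).sum().item()
    print(f"found {n_bad} non-finite input value(s)")
```

Running this on the real batches inside the training loop (before the forward pass) would tell you whether the NaN originates in the data or in the model.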