Nll_loss error: "only batches of spatial targets supported, but got targets of dimension 1"

imaginary_z · September 28, 2019, 10:44pm

Hi all,

I’m attempting to train a network on a large dataset of 64 x 64 images (~100,000 in total) using a simple ReLU model. I’m very new to both neural networks and PyTorch in general, so please bear with me.

My network appears to initialize and evaluate properly – my error arises when I attempt to call the nll_loss function in my validation method.

Here is the structure of my network:

# create neural net
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

class Net(nn.Module):
  def __init__(self):
    super(Net, self).__init__()
    self.fc1 = nn.Linear(64, 100) 
    self.fc2 = nn.Linear(100, 50)
    self.fc3 = nn.Linear(50, 10)
    
  def forward(self, x):
    # x = x.view(x.size(0), -1)
    x = F.relu(self.fc1(x))
    x = F.relu(self.fc2(x))
    x = self.fc3(x)
    return F.log_softmax(x, dim=1)  # explicit dim avoids the implicit-dim deprecation warning

And here is my validation method:

#validate net

def validation():
  network.eval()
  validation_loss = 0
  correct = 0
  with torch.no_grad():
    for data, target in validation_loader:
      output = network(data)
      validation_loss += F.nll_loss(output, target, reduction='sum').item() #ERROR IS HERE, ISSUE WITH TARGET DIMENSION
      pred = output.data.max(1, keepdim=True)[1]
      correct += pred.eq(target.data.view_as(pred)).sum()
  validation_loss /= len(validation_loader.dataset)
  validation_losses.append(validation_loss)
  print('\nValidation set: Avg. loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
    validation_loss, correct, len(validation_loader.dataset),
    100. * correct / len(validation_loader.dataset)))

The precise details of the error I get are:

RuntimeError                              Traceback (most recent call last)
<ipython-input-17-fe7e33c67fb7> in <module>()
----> 1 validation()
      2 for epoch in range(n_epochs):
      3   train(epoch)
      4   validation()
      5 test()

1 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in nll_loss(input, target, weight, size_average, ignore_index, reduce, reduction)
   1871         ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
   1872     elif dim == 4:
-> 1873         ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
   1874     else:
   1875         # dim == 3 or dim > 4

RuntimeError: invalid argument 3: only batches of spatial targets supported (3D tensors) but got targets of dimension: 1 at /pytorch/aten/src/THNN/generic/SpatialClassNLLCriterion.c:59

I checked the dimensions of my data and target variables, and they are [4096, 3, 64, 64] and [4096], respectively. I can see that my target variable is supposed to be 3D instead of 1D, but I’m at a loss for how to change it. In the tutorials I’ve followed for setting up this type of loss computation there was little in the way of transforming or augmenting these sorts of things.

If any of you have any tips or ideas on what’s happening here, I would really appreciate it.

Thanks!

phan_phan · September 29, 2019, 5:25pm

Hi there,
nn.Linear only acts on the last dimension of its input. So if the shape of data is [4096, 3, 64, 64] and self.fc1 = nn.Linear(64, 100), then self.fc1(data) treats each of the 64 rows in every image channel as a separate feature vector of size 64, and outputs a tensor of size [4096, 3, 64, 100], i.e. each row is mapped to a feature vector of size 100.

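The final output tensor would then be shaped [4096, 3, 64, 10]. Because that is a 4D tensor, F.nll_loss dispatches to nll_loss2d (as your traceback shows), which expects a 3D target of shape [batch, height, width], while your target is 1D with shape [4096]. That is probably the source of the error.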
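
To see the shapes for yourself, here is a minimal sketch. It assumes a small random stand-in batch of 8 images rather than your real 4096, and standalone layers named fc1/fc2/fc3 that mirror the sizes in your Net; none of the data here is from your dataset.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in data (assumption): 8 random images in the same [N, 3, 64, 64] layout,
# with random class labels in 0..9.
data = torch.randn(8, 3, 64, 64)
target = torch.randint(0, 10, (8,))

# Same layer sizes as in the posted Net.
fc1 = nn.Linear(64, 100)
fc2 = nn.Linear(100, 50)
fc3 = nn.Linear(50, 10)

# nn.Linear transforms only the last dimension; the leading ones are kept.
h1 = fc1(data)
out = F.log_softmax(fc3(F.relu(fc2(F.relu(h1)))), dim=1)
print(h1.shape)   # torch.Size([8, 3, 64, 100])
print(out.shape)  # torch.Size([8, 3, 64, 10])

# A 4D input makes nll_loss expect a 3D target, so a 1D target is rejected.
try:
    F.nll_loss(out, target)
    loss_failed = False
except (RuntimeError, ValueError) as err:
    loss_failed = True
    print(type(err).__name__, err)
```

Printing data.shape and output.shape inside your validation loop should show the same pattern with 4096 in place of 8.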

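Did you want to flatten your images instead? For example, you could re-enable the commented-out x = x.view(x.size(0), -1) line in forward and change fc1 to nn.Linear(3 * 64 * 64, 100), so that the output has shape [4096, 10] and matches the 1D target.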
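
If flattening is what you want, a rough sketch could look like the following. The class name FlatNet, the parameter names, and the random stand-in batch are my own placeholders, and I am assuming 10 classes with labels in 0..9 because your last layer has 10 outputs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FlatNet(nn.Module):  # hypothetical name, not from the original post
    def __init__(self, in_features=3 * 64 * 64, num_classes=10):
        super().__init__()
        # fc1 now takes the whole flattened image (3 * 64 * 64 = 12288 values).
        self.fc1 = nn.Linear(in_features, 100)
        self.fc2 = nn.Linear(100, 50)
        self.fc3 = nn.Linear(50, num_classes)

    def forward(self, x):
        x = x.view(x.size(0), -1)  # [N, 3, 64, 64] -> [N, 12288]
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return F.log_softmax(self.fc3(x), dim=1)  # [N, 10]

# Stand-in batch (assumption): 8 random images with random labels in 0..9.
data = torch.randn(8, 3, 64, 64)
target = torch.randint(0, 10, (8,))

network = FlatNet()
network.eval()
with torch.no_grad():
    output = network(data)
    # A 2D output [N, 10] with a 1D target [N] is what nll_loss expects here.
    loss = F.nll_loss(output, target, reduction='sum').item()
    pred = output.argmax(dim=1)
    correct = pred.eq(target).sum().item()

print(output.shape)  # torch.Size([8, 10])
```

With the output shaped [N, 10], the rest of your validation loop should work without reshaping the target.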

DoubleEleven1111 · March 6, 2020, 1:55pm

Hi, I have the same problem and I don’t know how to solve it. Did you solve the problem?