[CNN IMAGE SEGMENTATION] RuntimeError: only batches of spatial targets supported (3D tensors) but got targets of size: : [10, 3, 388, 388]

Hi there! I’m trying to code a CNN for the Rabbin dataset and I’ve run into an issue.

My dataset looks like this: Example

I’m using the U-Net architecture: the input image is 572x572 and the output image is 388x388.

My Dataset class is like this:

learning_rate = 1e-3
batch_size = 10
epochs = 10

class MyDataset(Dataset):

  def __init__(self, root_dir, transform = None, transform2 = None):
    # transform is applied to the input image, transform2 to the label
    self.root_dir = root_dir
    self.names = os.listdir(os.path.join(self.root_dir, "Original"))
    self.transform = transform
    self.transform2 = transform2

  def __len__(self):
    return len(self.names)

  def classes(self):
    return torch.Tensor([0, 1, 2])

  def __getitem__(self, idx):

    if torch.is_tensor(idx):
      idx = idx.tolist()

    # image and mask share the same filename in their respective folders
    image_name = self.names[idx]
    image = io.imread(os.path.join(self.root_dir, "Original", image_name))
    label = io.imread(os.path.join(self.root_dir, "Ground Truth", image_name))

    if self.transform is not None:
      image = self.transform(image)

    if self.transform2 is not None:
      label = self.transform2(label)

    return image, label

transformations = transforms.Compose(
    [transforms.ToTensor(),
     transforms.CenterCrop(572)]
)

transformations2 = transforms.Compose(
    [transforms.ToTensor(),
     transforms.CenterCrop(388)]
)

And my training function:

def training(model, dataloader, lossFn, opt):

  model.train()

  totalTrainLoss = 0
  trainCorrect = 0

  for x, y in dataloader:
    x = x.to(device)
    y = y.to(device)

    pred = model(x)
    loss = lossFn(pred, y)

    opt.zero_grad()
    loss.backward()
    opt.step()

    # use .item() so the computation graph is not kept alive across batches
    totalTrainLoss += loss.item()
    trainCorrect += (pred.argmax(1) == y).type(torch.float).sum().item()

  return totalTrainLoss, trainCorrect

I’m using this optimizer and loss function:

opt = Adam(model.parameters(), lr = learning_rate)
lossFn = nn.NLLLoss()

I call the training function in a for loop to run the epochs:

for e in range(0, epochs):

  (totalTrainLoss, trainCorrect) = training(model, train_dataLoader, lossFn, opt)
  (totalValLoss, valCorrect) = validation(model, val_dataLoader, lossFn)
  
  avgTrainLoss = totalTrainLoss / train_Steps
  avgValLoss = totalValLoss / val_Steps
  
  trainCorrect = trainCorrect / len(train_dataLoader.dataset)
  valCorrect = valCorrect / len(val_dataLoader.dataset)

  print("[INFO] EPOCH: {}/{}".format(e + 1, epochs))
  print("Train loss: {:.6f}, Train accuracy: {:.4f}".format(avgTrainLoss, trainCorrect))
  print("Val loss: {:.6f}, Val accuracy: {:.4f}\n".format(avgValLoss, valCorrect))

But every time it fails at loss = lossFn(pred, y) with the same error message: “RuntimeError: only batches of spatial targets supported (3D tensors) but got targets of size: : [10, 3, 388, 388]”. How can I solve it?

Any help would be deeply appreciated! :)

Your target should have the shape [batch_size, height, width] and contain class indices in the range [0, nb_classes-1] for a multi-class segmentation use case as seen here:


criterion = nn.NLLLoss()

batch_size = 2
height = width = 24
nb_classes = 10
output = torch.randn(batch_size, nb_classes, height, width, requires_grad=True)

# works
target = torch.randint(0, nb_classes, (batch_size, height, width))
loss = criterion(output, target)

# fails
target = torch.randint(0, nb_classes, (batch_size, nb_classes, height, width))
loss = criterion(output, target)
# RuntimeError: only batches of spatial targets supported (3D tensors) but got targets of dimension: 4

while your current target has 4 dimensions.
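
In your case the 4 dimensions most likely come from the label pipeline: transforms.ToTensor() converts the RGB ground-truth image into a float tensor of shape [3, 388, 388], which the DataLoader then batches into [10, 3, 388, 388]. A minimal sketch of this effect, using a zero-filled array as a placeholder for a real mask:

import numpy as np
from torchvision import transforms

# stand-in for a 388x388 RGB mask as io.imread would return it
mask = np.zeros((388, 388, 3), dtype=np.uint8)

label = transforms.ToTensor()(mask)
print(label.shape)  # torch.Size([3, 388, 388]) -> batched: [10, 3, 388, 388]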

Hello @ptrblck, thanks for your reply. But I still don’t get how to shrink my target, which in my case is the variable y, down to those 3 dimensions.

I forgot to say that my CNN segments 3 regions.

If I write:

for e in range(0, epochs):

   model.train()

   totalTrainLoss = 0
   totalValidLoss = 0

   trainCorrect = 0
   validCorrect = 0

   # Loop over training set
   for i, batch in enumerate(train_dataLoader):

       x, y = batch
       x, y = x.to(device), y.to(device)

       print(y.size())
       opt.zero_grad()

       pred = model(x)

       y = torch.randint(0, 3, (batch_size, 388, 388))

       loss = lossFn(pred, y)

       loss.backward()
       opt.step()

       totalTrainLoss += loss.item()
       trainCorrect += (pred.argmax(1) == y).type(torch.float).sum().item()

The error message disappears, but I don’t understand one thing: is y still my label after setting y = torch.randint(0, 3, (batch_size, 388, 388))? Is it still the same label loaded from my dataset?

No, you are creating y with random values in [0, 2] now, which are not your true class targets.
My code snippet uses random tensors to show the needed shape and value range, since I do not have access to your real dataset.
As previously described: the target should have the shape [batch_size, height, width] and contain class indices in the range [0, nb_classes-1] for a multi-class segmentation use case. If your true target still has 4 dimensions, you might be using a one-hot encoded target. In that case, use target = torch.argmax(target, dim=1) to create the expected target tensor.
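
A minimal sketch of that conversion, using a random one-hot tensor as a stand-in for your real target:

import torch
import torch.nn.functional as F

batch_size, nb_classes, height, width = 10, 3, 388, 388

# stand-in one-hot target with the problematic shape [10, 3, 388, 388]
indices = torch.randint(0, nb_classes, (batch_size, height, width))
one_hot = F.one_hot(indices, num_classes=nb_classes).permute(0, 3, 1, 2)

# collapse the class dimension back to class indices
target = torch.argmax(one_hot, dim=1)
print(target.shape)  # torch.Size([10, 388, 388]), values in [0, 2]

If your mask is instead stored as an RGB image with one distinct color per region, you would first map each color to its class index before this step.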


@ptrblck Thank you, that solved the error!