RuntimeError: invalid argument 3: only batches of spatial targets supported (3D tensors) but got targets of dimension: 1 at /pytorch/aten/src/THNN/generic/SpatialClassNLLCriterion.c:61

Hi

I am getting this error:

RuntimeError: invalid argument 3: only batches of spatial targets supported (3D tensors) but got targets of dimension: 1 at /pytorch/aten/src/THNN/generic/SpatialClassNLLCriterion.c:61

whenever I use my saved model for transfer learning.

Can anyone please help me out with this problem?

Thanks in advance

This error is usually raised when you pass a spatial model output (e.g. from a segmentation model) to a non-spatial target.
Have a look at this small example that reproduces the issue:

import torch
import torch.nn as nn

# Reproduce the error: 4D (spatial) output with a 1D (scalar) target
N, nb_classes, H, W = 1, 10, 24, 24
data = torch.randn(N, nb_classes, H, W, requires_grad=True)
target = torch.randint(0, nb_classes, (N,))

criterion = nn.CrossEntropyLoss()
loss = criterion(data, target)  # raises the RuntimeError

# Fixed by providing a spatial target of shape [N, H, W]
target = torch.randint(0, nb_classes, (N, H, W))
loss = criterion(data, target)

The error is fixed once you provide a spatial target as [N, H, W].
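Conversely, if your use case is plain classification and the scalar target is what you actually want, the fix goes the other way: the model output should be 2-dimensional. A minimal sketch of that combination, using the same made-up shapes:

import torch
import torch.nn as nn

N, nb_classes = 1, 10

# Plain classification: logits [N, nb_classes] with a 1D target [N]
logits = torch.randn(N, nb_classes, requires_grad=True)
target = torch.randint(0, nb_classes, (N,))

criterion = nn.CrossEntropyLoss()
loss = criterion(logits, target)  # no error: 2D output matches 1D target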

Implementation of the CNN/ConvNet model using PyTorch:

keep_prob = 0.7  # dropout keep probability (value assumed; it was not defined in the original post)

class CNN(torch.nn.Module):

    def __init__(self):
        super(CNN, self).__init__()
        # L1 ImgIn shape=(?, 3, 250, 250)
        #    Conv     -> (?, 32, 250, 250)
        #    Pool     -> (?, 32, 125, 125)
        self.layer1 = torch.nn.Sequential(
            torch.nn.Conv2d(3, 32, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2, stride=2),
            torch.nn.Dropout(p=1 - keep_prob))
        # L2 ImgIn shape=(?, 32, 125, 125)
        #    Conv      -> (?, 64, 125, 125)
        #    Pool      -> (?, 64, 62, 62)
        self.layer2 = torch.nn.Sequential(
            torch.nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2, stride=2),
            torch.nn.Dropout(p=1 - keep_prob))
        # L3 ImgIn shape=(?, 64, 62, 62)
        #    Conv      -> (?, 128, 62, 62)
        #    Pool      -> (?, 128, 32, 32)  (padding=1 rounds 62/2 up to 32)
        self.layer3 = torch.nn.Sequential(
            torch.nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2, stride=2, padding=1),
            torch.nn.Dropout(p=1 - keep_prob))

        # L4 FC: 128 * 32 * 32 = 131072 inputs -> 240 outputs
        self.fc1 = torch.nn.Linear(128 * 32 * 32, 240, bias=True)
        torch.nn.init.xavier_uniform_(self.fc1.weight)
        self.layer4 = torch.nn.Sequential(
            self.fc1,
            torch.nn.ReLU(),
            torch.nn.Dropout(p=1 - keep_prob))
        # L5 Final FC: 240 inputs -> 2 outputs
        self.fc2 = torch.nn.Linear(240, 2, bias=True)
        torch.nn.init.xavier_uniform_(self.fc2.weight)  # initialize parameters

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = self.layer3(out)
        out = out.view(out.size(0), -1)  # flatten for the FC layers
        out = self.layer4(out)           # fc1 + ReLU + dropout
        out = self.fc2(out)
        return out

Instantiate the CNN model:

model = CNN()
model

I have used the above CNN network to train on two classes, and now I want to reuse the trained network to train on two other classes.

The first training run completed successfully and I saved the model file, but when I load the trained network and reuse it, I get the same runtime error.
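In case it helps, the reloading step looks roughly like the following (a sketch with a placeholder checkpoint path, assuming the model was saved via its state_dict):

import torch

model = CNN()
model.load_state_dict(torch.load('cnn_two_classes.pth'))  # placeholder path

# Reinitialize the final classifier for the new two-class task
model.fc2 = torch.nn.Linear(240, 2, bias=True)
torch.nn.init.xavier_uniform_(model.fc2.weight)

# Optionally freeze the convolutional layers and fine-tune only the FC head
for layer in (model.layer1, model.layer2, model.layer3):
    for param in layer.parameters():
        param.requires_grad = False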

Could you print the shapes of your model output as well as the target before passing them to your criterion, please?
I still think the error is due to a shape mismatch in your target tensor.
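For example, something like this right before the loss computation (the variable names are placeholders for whatever your training loop uses):

output = model(images)
print('output:', output.shape)  # expected [batch_size, 2] for this model
print('target:', target.shape)  # expected [batch_size] with values 0 or 1
loss = criterion(output, target)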


----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1        [128, 32, 250, 250]             896
              ReLU-2        [128, 32, 250, 250]               0
         MaxPool2d-3        [128, 32, 125, 125]               0
           Dropout-4        [128, 32, 125, 125]               0
            Conv2d-5        [128, 64, 125, 125]          18,496
              ReLU-6        [128, 64, 125, 125]               0
         MaxPool2d-7          [128, 64, 62, 62]               0
           Dropout-8          [128, 64, 62, 62]               0
            Conv2d-9         [128, 128, 62, 62]          73,856
             ReLU-10         [128, 128, 62, 62]               0
        MaxPool2d-11         [128, 128, 32, 32]               0
          Dropout-12         [128, 128, 32, 32]               0
           Linear-13                 [128, 240]      31,457,520
           Linear-14                 [128, 240]      31,457,520
           Linear-15                   [128, 2]             482
================================================================
Total params: 63,008,770
Trainable params: 63,008,770
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 91.55
Forward/backward pass size (MB): 8533.91
Params size (MB): 240.36
Estimated Total Size (MB): 8865.82
----------------------------------------------------------------

This is the overall model summary. I am initially passing images of shape [3, 250, 250] with 2 classes, and once the model is trained I want to retrain it on a smaller dataset, but it gives me the error above.

Hi, I’m a new learner and I have the same question. The error is: only batches of spatial targets supported (3D tensors) but got targets of dimension: 1. My target size is [128] and my prediction size is [128, 256, 6, 1000]. I noticed your solution, which is to provide a spatial target as [N, H, W], but I don’t know how to do it. I’ve googled it many times and still haven’t found a way. Can you give me some examples of how to provide a spatial target as [N, H, W]? Thanks a lot.

You shouldn’t simply reshape your target to a spatial shape if your targets are scalar class indices.
Instead of changing the target, you should change your model architecture so that it returns logits in the right shape.
Based on the provided target shape (multi-class classification), your model should return logits of shape [batch_size, nb_classes].
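A minimal sketch of what that could look like for your shapes: pool the [128, 256, 6, 1000] feature map down, flatten it, and feed it through a final Linear layer (nb_classes is assumed here, and the head itself is hypothetical):

import torch
import torch.nn as nn

nb_classes = 10  # assumption; use your actual number of classes

head = nn.Sequential(
    nn.AdaptiveAvgPool2d(1),     # [128, 256, 6, 1000] -> [128, 256, 1, 1]
    nn.Flatten(),                # -> [128, 256]
    nn.Linear(256, nb_classes),  # -> [128, nb_classes]
)

features = torch.randn(128, 256, 6, 1000)  # stand-in for your activations
logits = head(features)                    # [128, nb_classes]
target = torch.randint(0, nb_classes, (128,))

loss = nn.CrossEntropyLoss()(logits, target)  # 1D target now works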

Thank you! It works😉