CrossEntropyLoss giving a dimension error

Hello,

I'm working on image classification for my research.
Here's the training code (part of a Jupyter cell):

# Training loop
num_total_images = len(Train_loader.dataset)  # total number of training images per epoch
step = 0
losses = []
accuracies = []
steps = []
for epoch in range(num_epochs):
    train_loss = 0.0
    train_acc = 0.0
    for i, (images, labels) in enumerate(Train_loader):
        images = images.to(device)
        # print(len(images))
        labels = labels.to(device)
        # print(labels)
        
        #forward
        outputs = model(images)
        # print(outputs)
        loss = criterion(outputs, labels)
        losses.append(loss.item())
        
        
        # backwards and optimizer
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        # Calculating running training accuracies
        _, predictions = outputs.max(1)
        print((predictions))
        num_correct = (predictions == labels).sum()
        running_train_acc = float(num_correct)/float(images.shape[0])
        accuracies.append(running_train_acc)
        
        train_acc += running_train_acc
        train_loss += loss.item()
        
        # Running epoch averages (only the value after the last batch
        # of an epoch is the true per-epoch mean)
        avg_train_acc = train_acc / len(Train_loader)
        avg_train_loss = train_loss / len(Train_loader)
        
        writer.add_scalar('Training Loss', loss.item(), global_step=step)
        writer.add_scalar('Training Accuracy', running_train_acc, global_step=step)
        step += 1
        steps.append(step)
        
        
        if (i+1) % 10 == 0:
            print('Epoch [{}/{}], Images [{}/{}], Loss: {:.4f}'
                  .format(epoch+1, num_epochs, (i+1)*len(images), num_total_images, loss.item()))
            
        torch.cuda.empty_cache()
            
print('Training Ended')

And here’s the error I get:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_34/669514399.py in <module>
     17         outputs = model(images)
     18         # print(outputs)
---> 19         loss = criterion(outputs, labels)
     20         losses.append(loss.item())
     21 

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/loss.py in forward(self, input, target)
   1119     def forward(self, input: Tensor, target: Tensor) -> Tensor:
   1120         return F.cross_entropy(input, target, weight=self.weight,
-> 1121                                ignore_index=self.ignore_index, reduction=self.reduction)
   1122 
   1123 

/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction)
   2822     if size_average is not None or reduce is not None:
   2823         reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 2824     return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
   2825 
   2826 

RuntimeError: 1only batches of spatial targets supported (3D tensors) but got targets of size: : [12]

So I printed the shapes of outputs and labels:
The shape of outputs: torch.Size([12, 2, 224, 224])
The shape of labels: torch.Size([12])

The labels shape looks right to me, since labels holds one classification label per image:
tensor([1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1], device='cuda:0')

Is anyone familiar with this problem? Thanks!

Almost forgot to provide my loss criterion, optimizer, and writer:

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr = learning_rate)
writer = SummaryWriter('runs/CNN/Plotting_on_tensorBoard')

Hi Ted -

Your post nicely formats the error, which makes it easy for others to help. Nice work!

Your outputs need to be shaped like n_batches x n_labels, since they hold, for each image (n_batches), the “score” assigned to each label (n_labels). It looks like your outputs are shaped like n_batches x n_labels x height x width, which is what’s causing the issue.

If you meant to have one label per image, then you need to change your model to output something shaped like n_batches x n_labels. This is usually done by adding a fully connected layer with the desired number of output features (namely n_labels, which is 2 in your case) after the convolutional layers.
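
For example, here's a minimal check of the shapes nn.CrossEntropyLoss accepts (the shapes below are made up to match yours):

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

# Class-index targets: logits must be (n_batches, n_labels)
logits = torch.randn(12, 2)           # 12 images, 2 classes
labels = torch.randint(0, 2, (12,))   # one class index per image
print(criterion(logits, labels))      # works

# A (n_batches, n_labels, H, W) output would instead need
# (n_batches, H, W) targets -- the "spatial targets" in your error
spatial = torch.randn(12, 2, 224, 224)
# criterion(spatial, labels)  # raises the RuntimeError you saw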

Hope this helps!
Andrei

Hi Andrei,
Thanks for helping!

I added nn.Linear layers to my model. Here's the model code:

class ConvNet2(nn.Module):

    def __init__(self):
        super().__init__()
        
        self.conv2 = nn.Sequential(
            # dconv_down1
            nn.Conv2d(3, 64, 3, stride=1, padding=1),
            nn.BatchNorm2d(64),
            nn.LeakyReLU(inplace=True),
            nn.Conv2d(64, 64, 4, stride=2, padding=1),
            nn.BatchNorm2d(64),
            nn.LeakyReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            # dconv_down2
            nn.Conv2d(64, 128, 4, stride=2, padding=1),
            nn.BatchNorm2d(128),
            nn.LeakyReLU(inplace=True),
            nn.Conv2d(128, 128, 4, stride=2, padding=1),
            nn.BatchNorm2d(128),
            nn.LeakyReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            # dconv_down3
            nn.Conv2d(128, 256, 4, stride=2, padding=1),
            nn.BatchNorm2d(256),
            nn.LeakyReLU(inplace=True),
            nn.Conv2d(256, 256, 4, stride=2, padding=1),
            nn.BatchNorm2d(256),
            nn.LeakyReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            # dconv_down4
            nn.Conv2d(256, 512, 4, stride=2, padding=1),
            nn.BatchNorm2d(512),
            nn.LeakyReLU(inplace=True),
            nn.Conv2d(512, 512, 4, stride=2, padding=1),
            nn.BatchNorm2d(512),
            nn.LeakyReLU(inplace=True),
        )
        
        self.fc = nn.Sequential(
            nn.Linear(512*7*7, 256),
            nn.Linear(256, 128),
            nn.Linear(128, 64),
            nn.Linear(64, 2)
        )

    def forward(self, x):
        out = self.conv2(x)
        print(out.shape)
        out = out.reshape(out.size(0), -1)
        print(out.shape)
        out = self.fc(out)
        return out

However, I get this error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_33/32309055.py in <module>
      2 model = ConvNet2()
      3 x = torch.rand(10,3, 224,224)
----> 4 print(model(x).shape)
      5 output = model(x)
      6 print(output)

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

/tmp/ipykernel_33/2822100006.py in forward(self, x)
    107         out = out.reshape(out.size(0),-1)
    108         print(out.shape)
--> 109         out = self.fc(out)
    110         return out

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/container.py in forward(self, input)
    137     def forward(self, input):
    138         for module in self:
--> 139             input = module(input)
    140         return input
    141 

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

/opt/conda/lib/python3.7/site-packages/torch/nn/modules/linear.py in forward(self, input)
     94 
     95     def forward(self, input: Tensor) -> Tensor:
---> 96         return F.linear(input, self.weight, self.bias)
     97 
     98     def extra_repr(self) -> str:

/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py in linear(input, weight, bias)
   1845     if has_torch_function_variadic(input, weight):
   1846         return handle_torch_function(linear, (input, weight), input, weight, bias=bias)
-> 1847     return torch._C._nn.linear(input, weight, bias)
   1848 
   1849 

RuntimeError: mat1 and mat2 shapes cannot be multiplied (10x512 and 25088x256)

It happens when I use this piece of code to check the output dimensions:

model = ConvNet2()
x = torch.rand(10, 3, 224, 224)
print(model(x).shape)
output = model(x)
print(output)

I think the problem is in the nn.Linear layers, but I have no clue what's wrong with them. ;-;
Thanks

Fixed it by modifying nn.Linear(512*7*7, 256) to nn.Linear(512, 256). With 224x224 inputs, the strided convolutions downsample all the way to 1x1 spatially, so the flattened feature size is 512*1*1 = 512, not 512*7*7 (that's the 10x512 mat1 in the error).
Thank you Andrei!
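
For anyone hitting the same thing, here's a small sketch of how to measure the flattened size with a dummy forward pass instead of hard-coding it (infer_flat_features is just a made-up helper name):

import torch

# Hypothetical helper: run one dummy batch through the conv stack and
# count the flattened features, instead of computing 512*H*W by hand.
def infer_flat_features(conv, in_shape=(3, 224, 224)):
    conv.eval()  # BatchNorm needs >1 value per channel in training mode
    with torch.no_grad():
        out = conv(torch.zeros(1, *in_shape))
    return out.reshape(1, -1).size(1)

print(infer_flat_features(ConvNet2().conv2))  # 512 -> nn.Linear(512, 256)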
