Code in 0.4.0 returns empty while 0.3.0 runs normally

Hi All, I would like to report a strange bug(?) in version 0.4.0. I have a network that produces per-voxel segmentation results. With version 0.3.0, the network trains normally (segmentation test results are good). After switching to 0.4.0 with the exact same code (and the same input data, of course), I get empty return values for the loss (cross-entropy loss). The main code is as follows:

loss_main = cross_entropy_3d(output, label_batch)
# in 0.3.0, loss_main returns a value normally; however in 0.4.0, loss_main seems to return nothing.

def cross_entropy_3d(x, y):
    # x: (N, C, X, Y, Z) logits, y: (N, 1, X, Y, Z) integer labels
    n, c, _, _, _ = x.size()
    # move the channel dim last, then flatten: x_t is (N*X*Y*Z, C)
    x_t = torch.transpose(torch.transpose(torch.transpose(x, 1, 2), 2, 3), 3, 4).contiguous().view(-1, c)
    # flatten the labels to (N*X*Y*Z,)
    y_t = torch.transpose(torch.transpose(torch.transpose(y, 1, 2), 2, 3), 3, 4).contiguous().view(-1).long()
    loss = F.cross_entropy(x_t, y_t)
    #loss = loss / x_t.size()[0]
    return loss
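(For readability, the chain of transposes above is equivalent to a single permute. A minimal sketch of that variant, assuming x is (N, C, X, Y, Z) and y is (N, 1, X, Y, Z) as above:)

import torch
import torch.nn.functional as F

def cross_entropy_3d_permuted(x, y):
    c = x.size(1)
    # (N, C, X, Y, Z) -> (N, X, Y, Z, C) -> (N*X*Y*Z, C)
    x_t = x.permute(0, 2, 3, 4, 1).contiguous().view(-1, c)
    # y's channel dim has size 1, so a plain flatten yields the same element order
    y_t = y.contiguous().view(-1).long()
    return F.cross_entropy(x_t, y_t)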

What do you mean by ‘empty’? Do you have a short piece of code (e.g. using random numbers) that we can run to reproduce it?

Hi. CrossEntropyLoss can support inputs of any size. Have you tried code like this?

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
criterion = nn.CrossEntropyLoss().to(device)
loss = criterion(output, label_batch)  
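From 0.4.0 the loss also accepts K-dimensional inputs of shape (N, C, d1, d2, ...) directly, so no manual flattening is needed. A sketch with random data, assuming a 5D segmentation output like your cross_entropy_3d expects:

import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
criterion = nn.CrossEntropyLoss().to(device)

output = torch.randn(5, 2, 3, 3, 3, device=device)  # (N, C, X, Y, Z) logits
label_batch = torch.randint(0, 2, (5, 3, 3, 3), dtype=torch.long, device=device)  # (N, X, Y, Z)
loss = criterion(output, label_batch)
print(loss.item())  # a single averaged value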

Try running the following example script in 0.4.0 and 0.3.0. In 0.3.0, the script runs normally and the returned loss_main has shape torch.Size([1]). However, in 0.4.0, the returned loss_main has shape torch.Size([]). Maybe 0.4.0 has some usage changes that I am not aware of.

import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable
import torch.nn.functional as F

def cross_entropy_3d(x, y, w=None):
    # x: (N, C, X, Y, Z) logits, y: (N, 1, X, Y, Z) integer labels, w: optional class weights
    n, c, _, _, _ = x.size()
    # move the channel dim last, then flatten: x_t is (N*X*Y*Z, C)
    x_t = torch.transpose(torch.transpose(torch.transpose(x, 1, 2), 2, 3), 3, 4).contiguous().view(-1, c)
    # flatten the labels to (N*X*Y*Z,)
    y_t = torch.transpose(torch.transpose(torch.transpose(y, 1, 2), 2, 3), 3, 4).contiguous().view(-1).long()
    loss = F.cross_entropy(x_t, y_t, weight=w)
    #loss = loss / x_t.size()[0]
    return loss


# output = torch.rand(5, 2, 3, 3, 3)
# label_batch = torch.randint(0, 2, (5, 1, 3, 3, 3), dtype=torch.long)
# output, label_batch = Variable(output.cuda()), Variable(label_batch.cuda())
# float64 logits and int64 labels, matching the shapes cross_entropy_3d expects
output = np.random.randn(5, 2, 3, 3, 3)
label_batch = np.random.randint(2, size=(5, 1, 3, 3, 3))
# Variable is deprecated (a no-op wrapper) in 0.4.0 but is kept here for 0.3.0 compatibility
output, label_batch = Variable(torch.from_numpy(output).cuda()), Variable(torch.from_numpy(label_batch).cuda())

loss_main = cross_entropy_3d(output, label_batch)
print(loss_main.shape)
# prints the value in 0.3.0; in 0.4.0 this raises an IndexError because
# loss_main is 0-dimensional (loss_main.item() works there instead)
print(loss_main.cpu().data.numpy()[0])

torch.Size([]) means that it is a scalar, not empty! :slight_smile: In 0.4.0 we introduced proper scalar (i.e. 0-dimensional tensor) support, so that’s why it has an “empty” shape. But it actually contains a value!
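For example, a quick sketch of the 0.4.0 behavior:

import torch

loss = torch.tensor(0.6931)  # a 0-dimensional (scalar) tensor
print(loss.shape)   # torch.Size([]) -- zero dimensions, but it holds one value
print(loss.item())  # 0.6931 -- the recommended way to get the Python number out
# indexing such as loss.data.numpy()[0] fails on a 0-d tensor, since .numpy()
# returns a 0-d array; use loss.item() or float(loss) instead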

Hi, thank you for pointing that out. It turns out that I misunderstood 0-dimensional tensors in 0.4.0.

BTW,

CrossEntropyLoss supports inputs of any size starting from 0.4.0:
https://pytorch.org/docs/stable/nn.html?highlight=crossentropyloss#torch.nn.CrossEntropyLoss

In 0.3.0, the input of CrossEntropyLoss had to be a 2D tensor of size (N, C): https://pytorch.org/docs/0.3.0/nn.html?highlight=crossentropyloss#torch.nn.CrossEntropyLoss
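So in 0.4.0 the manual flattening in cross_entropy_3d is no longer required. A quick sketch (CPU, shapes from the script above) showing that the flattened and direct calls should give the same value:

import torch
import torch.nn.functional as F

output = torch.randn(5, 2, 3, 3, 3)  # (N, C, X, Y, Z)
label = torch.randint(0, 2, (5, 1, 3, 3, 3), dtype=torch.long)

# manual flattening, as in cross_entropy_3d above
x_t = output.permute(0, 2, 3, 4, 1).contiguous().view(-1, 2)
loss_flat = F.cross_entropy(x_t, label.view(-1))

# direct K-dimensional call, supported from 0.4.0 (channel dim squeezed out of the target)
loss_direct = F.cross_entropy(output, label.squeeze(1))

print(loss_flat.item(), loss_direct.item())  # should match up to floating-point rounding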

Oh, I see! Thanks a lot!