CrossEntropyLoss dimension out of range

IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)

target-> tensor([1., 0., 0., 1., 0., 1., 1., 0., 1., 1., 1., 1., 1., 0., 1., 1., 0., 0.,
0., 0., 1., 0., 0., 0., 1., 0.])
output-> tensor([-3.2691, -3.1722, -3.2648, -3.3598, -3.2569, -3.4140, -3.2098, -3.2161,
-3.4087, -3.2803, -3.1042, -3.1663, -3.3718, -3.3803, -3.1643, -3.2461,
-3.2690, -3.3987, -3.3615, -3.1216, -3.3198, -3.4017, -3.1110, -3.2741,
-3.1383, -3.1585], grad_fn=<...>)

def train(epoch):
  model.train()
  for batch_idx, (data, target) in enumerate(train_loader):
    optimizer.zero_grad()
    output = model(data)
    print('target->',target[0])
    print('output->',output[0])
    criterion = nn.CrossEntropyLoss()
    loss = criterion(output[0],target[0])
    loss.backward()
    optimizer.step()
    if batch_idx % log_interval == 0:
      print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
        epoch, batch_idx * len(data), len(train_loader.dataset),
        100. * batch_idx / len(train_loader), loss.item()))
      train_losses.append(loss.item())
      train_counter.append(
        (batch_idx*64) + ((epoch-1)*len(train_loader.dataset)))
      torch.save(model.state_dict(), './result/model.pth')
      torch.save(optimizer.state_dict(), './result/optimizer.pth')

Hi Jake!

I see two issues:

First, PyTorch (models, losses, etc.) works with batches of samples.
criterion(output[0], target[0]) looks like you are passing a single
sample from the batch to the criterion, and that will break things.
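
For example, here is a minimal sketch of calling the criterion on the
whole batch at once. The shapes and values are made up (4 samples,
2 classes) and just stand in for your model(data):

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

# made-up batch of 4 samples and 2 classes, standing in for model(data)
output = torch.randn(4, 2, requires_grad=True)  # shape [nBatch, nClass]
target = torch.tensor([1, 0, 0, 1])             # shape [nBatch], integer labels

loss = criterion(output, target)  # the full batch, not output[0] / target[0]
loss.backward()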

Second, the shapes of output and target don't match what
CrossEntropyLoss expects. CrossEntropyLoss wants your output to
have shape [nBatch, nClass] and your target to have shape [nBatch]
(no class dimension). target should also be a LongTensor of integer
class labels that run from 0 to nClass - 1, not the FloatTensor of
0.s and 1.s that you printed.
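
Concretely, a sketch assuming two classes and the 26-sample target
you printed: the output gets a class dimension, and the float target
gets converted to integer labels:

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

# output must have a class dimension: [nBatch, nClass]
output = torch.randn(26, 2, requires_grad=True)

# the float 0./1. target you printed must become integer class labels
target = torch.tensor([1., 0., 0., 1., 0., 1., 1., 0., 1., 1., 1., 1., 1.,
                       0., 1., 1., 0., 0., 0., 0., 1., 0., 0., 0., 1., 0.])
target = target.long()  # shape [nBatch], dtype int64, values in {0, 1}

loss = criterion(output, target)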

One final comment: Your target contains only 0s and 1s, so it looks
like you are working with a binary classification problem. You can
treat a binary problem as a two-class multi-class problem. Doing so
is fine, and makes perfect sense, but you might prefer treating this
explicitly as a binary problem and using something like
BCEWithLogitsLoss as your loss criterion.
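
Here is a sketch of that alternative, again with made-up logits: the
model emits one raw logit per sample (no class dimension), and your
float 0./1. target then works as-is:

import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()

# one raw logit per sample -- no class dimension
output = torch.randn(26, requires_grad=True)  # shape [nBatch]

# BCEWithLogitsLoss wants float targets, so your target works unchanged
target = torch.tensor([1., 0., 0., 1., 0., 1., 1., 0., 1., 1., 1., 1., 1.,
                       0., 1., 1., 0., 0., 0., 0., 1., 0., 0., 0., 1., 0.])

loss = criterion(output, target)
loss.backward()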

Best.

K. Frank